Episode 13: SQL Basics for Governance
The conference room was quiet except for the clicking of keyboards. Around me, my colleagues were quickly writing queries to extract and analyze data from the bank's systems. Meanwhile, I was staring at my screen, trying to remember the difference between INNER JOIN and LEFT JOIN, and wondering if I would ever catch up.
Taking the Initiative
When Tola announced a "SQL Workshop for Governance Practitioners," I had immediately signed up. After my work on the enterprise lineage project, I had repeatedly run into situations where I needed to validate data flows by examining the actual data. Each time, I had to ask one of my technical colleagues for help. It was becoming clear that without some SQL skills, I would always be dependent on others for data access and validation.
The Learning Curve
"SQL is the language of data," Tola had said. "You don't need to become a database expert, but understanding the basics will make you more effective in governance."
Workshop Challenges
Now, halfway through the workshop, I was feeling that familiar mix of frustration and determination. The syntax looked logical when our instructor, Emeka from the Data Engineering team, demonstrated it, but when I tried to write my own queries, the errors piled up.
The Governance-SQL Connection
  1. Data Quality Validation: Testing if data meets quality standards through direct queries
  2. Metadata Verification: Comparing system metadata with documented metadata
  3. Access Control Auditing: Reviewing who has access to sensitive data
  4. Lineage Validation: Confirming data transformations match what's documented
The workshop had begun with Tola explaining why SQL skills mattered for governance work:
"Governance isn't just about policies and frameworks," she said. "It's about ensuring those policies are reflected in how data actually behaves. SQL gives you the ability to check if reality matches expectations."
"When you can query the data yourself," Tola emphasized, "you have direct evidence, not just someone else's word for it."
Starting with the Basics
  • SELECT: Choosing what data you want to see
  • FROM: Specifying where to find the data
  • WHERE: Filtering for just what you need
Emeka began the hands-on portion with fundamental SQL concepts. He provided a sandbox database with sample bank data for us to practice on.
"SELECT, FROM, and WHERE are the building blocks of SQL," he explained. "Think of SELECT as choosing what data you want to see, FROM as specifying where to find it, and WHERE as filtering for just what you need."
He demonstrated a simple query:
    SELECT customer_id, account_type, balance
    FROM accounts
    WHERE balance > 1000000;
"This query shows high-value accounts," he explained. "Already, from a governance perspective, you can see how this might be useful for identifying data that requires special handling under our classification policies."
My first attempts at writing queries were clumsy but successful. I managed to retrieve customer information and account details using simple SELECT statements. Each small victory boosted my confidence.
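To try Emeka's query outside the workshop, a small sandbox is enough. The sketch below is one way to set that up, assuming Python's built-in sqlite3 module stands in for the bank's database. The sample rows and the high_value_accounts helper are invented for illustration and are not part of the workshop material.

```python
import sqlite3

# In-memory sandbox; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (customer_id INTEGER, account_type TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [
        (1, "Savings", 250_000.0),
        (2, "Current", 1_500_000.0),
        (3, "Corporate", 9_750_000.0),
        (4, "Savings", 1_000_000.0),  # exactly at the threshold, so ">" excludes it
    ],
)

# SELECT picks the columns, FROM names the table, WHERE filters the rows.
HIGH_VALUE_SQL = """
    SELECT customer_id, account_type, balance
    FROM accounts
    WHERE balance > ?
    ORDER BY customer_id
"""


def high_value_accounts(threshold):
    """Return accounts whose balance is strictly above the threshold."""
    return conn.execute(HIGH_VALUE_SQL, (threshold,)).fetchall()


high_value = high_value_accounts(1_000_000)
for row in high_value:
    print(row)
```

Passing the threshold as a parameter rather than hard-coding it makes the same check easy to rerun if the classification policy changes.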
Joining the Data Governance Dots
  1. Customer Data: Information about bank customers
  2. JOIN Operation: Connects related tables
  3. Account Data: Details of customer accounts
  4. Complete Picture: Comprehensive view for governance
Things got more interesting when Emeka introduced JOIN operations, explaining how to connect related tables:
"In governance, you'll often need to connect different datasets to get the complete picture," he said. "For example, connecting customer data with transaction data to check if data quality issues in one affect the other."
He demonstrated with a query that joined customer information with their accounts:
    SELECT c.customer_name, c.customer_type, a.account_type, a.balance
    FROM customers c
    INNER JOIN accounts a ON c.customer_id = a.customer_id
    WHERE c.customer_type = 'Corporate';
"This shows all corporate customer accounts," Emeka explained. "From a governance perspective, you might use this to verify that corporate accounts are flagged correctly for regulatory reporting."
I practiced with increasingly complex joins, gradually becoming more comfortable with the syntax. By connecting customer, account, and transaction tables, I could start to see how data flowed through the bank's systems—the very lineage I had been documenting was now visible through queries.
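The difference between INNER JOIN and LEFT JOIN is easiest to see side by side. The following is a minimal sketch, assuming an in-memory SQLite database and invented sample rows. INNER JOIN keeps only customers that have a matching account, while LEFT JOIN keeps every customer and fills the account columns with NULL when there is no match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample data: 'Zenith Plc' is a corporate customer with no account.
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY, customer_name TEXT, customer_type TEXT);
    CREATE TABLE accounts (
        account_id INTEGER PRIMARY KEY, customer_id INTEGER,
        account_type TEXT, balance REAL);
    INSERT INTO customers VALUES
        (1, 'Acme Ltd', 'Corporate'),
        (2, 'Zenith Plc', 'Corporate'),
        (3, 'A. Bello', 'Retail');
    INSERT INTO accounts VALUES
        (10, 1, 'Current', 5000000),
        (11, 1, 'Savings', 200000),
        (12, 3, 'Savings', 15000);
""")

# INNER JOIN: only customers with at least one matching account row.
inner_rows = conn.execute("""
    SELECT c.customer_name, a.account_type
    FROM customers c
    INNER JOIN accounts a ON c.customer_id = a.customer_id
    WHERE c.customer_type = 'Corporate'
    ORDER BY c.customer_name, a.account_type
""").fetchall()

# LEFT JOIN: every corporate customer, with NULL (None) where no account exists.
left_rows = conn.execute("""
    SELECT c.customer_name, a.account_type
    FROM customers c
    LEFT JOIN accounts a ON c.customer_id = a.customer_id
    WHERE c.customer_type = 'Corporate'
    ORDER BY c.customer_name, a.account_type
""").fetchall()

print(inner_rows)
print(left_rows)
```

For governance checks, the LEFT JOIN form is often the more useful one, because the rows with no match are usually the ones worth investigating.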
Real-World Governance Applications
Data Quality Checks
Tola demonstrated how SQL could identify data quality issues:
    -- Finding missing values in critical fields
    SELECT COUNT(*) AS missing_email_count
    FROM customers
    WHERE email IS NULL OR email = '';

    -- Listing distinct transaction_date values to spot inconsistent date formats
    SELECT transaction_date, COUNT(*) AS transaction_count
    FROM transactions
    GROUP BY transaction_date
    ORDER BY transaction_date;
"These simple queries can reveal data quality issues that might affect reporting or customer service," she explained. "As governance professionals, we need to ensure critical data is complete and consistent."
Classification Validation
  1. Identify Potentially Sensitive Data
    SELECT column_name, table_name
    FROM information_schema.columns
    WHERE column_name LIKE '%passport%'
       OR column_name LIKE '%ssn%'
       OR column_name LIKE '%birth%';
  2. Check Proper Classification
    SELECT t.table_name, c.column_name
    FROM information_schema.tables t
    JOIN information_schema.columns c
      ON t.table_schema = c.table_schema AND t.table_name = c.table_name
    LEFT JOIN data_classification dc
      ON t.table_name = dc.table_name AND c.column_name = dc.column_name
    WHERE (c.column_name LIKE '%passport%'
           OR c.column_name LIKE '%ssn%'
           OR c.column_name LIKE '%birth%')
      AND dc.classification_level IS NULL;
  3. Take Action on Findings
"These queries help us find potentially sensitive data that hasn't been properly classified," Tola explained. "This is crucial for compliance with privacy regulations."
Lineage Validation

  1. Source Data: Original transaction data
  2. Transformation: Data processing rules
  3. Target Data: Reporting transaction data
  4. Validation: Verify correct transformation
Tola then demonstrated how SQL could validate data lineage:
    -- Comparing source and target data to verify transformations
    SELECT COUNT(*) AS source_count FROM source_transactions;
    SELECT COUNT(*) AS target_count FROM reporting_transactions;

    -- Checking if transformation rules are applied correctly
    SELECT source.customer_segment,
           AVG(source.transaction_amount) AS source_avg,
           AVG(target.transaction_amount) AS target_avg
    FROM source_transactions source
    JOIN reporting_transactions target
      ON source.transaction_id = target.source_transaction_id
    GROUP BY source.customer_segment;
"These queries let you verify if data is being transformed correctly as it moves between systems," Tola said. "This helps validate the lineage maps we create."
My Breakthrough Moment

  1. Understanding the Problem: Focus on what you want to accomplish
  2. Breaking It Down: Step-by-step approach to the solution
  3. Writing the Query: Translating the steps into SQL
  4. Finding Real Issues: Discovering actual governance gaps
As the workshop continued, I struggled with more complex queries. During a break, I expressed my frustration to Tola.
"I understand the concepts, but I keep making syntax errors," I admitted.
"That's normal," she reassured me. "SQL is a skill that improves with practice. Focus on what you want to accomplish, then figure out the SQL to get there."
Taking her advice, I focused on a real governance question: Were all high-value accounts properly classified as "restricted" in our system? This was relevant to our data classification policy, which required special handling for accounts over certain thresholds.
I broke the problem down step by step:
  1. Find accounts with high balances
  2. Join with the classification table
  3. Identify any that weren't properly classified
Finding Real Governance Issues

  1. Write the Query: After several attempts and help from Emeka, I produced a working query
  2. Run the Analysis: Execute the query against the database
  3. Discover Issues: Found 28 high-value accounts not properly classified
  4. Take Action: Identify a compliance gap requiring correction
    SELECT a.account_id, a.account_type, a.balance,
           COALESCE(dc.classification_level, 'Unclassified') AS classification
    FROM accounts a
    LEFT JOIN data_classification dc ON a.account_id = dc.data_element_id
    WHERE a.balance > 1000000
      AND (dc.classification_level IS NULL
           OR dc.classification_level != 'Restricted');
When I ran the query, it returned 28 high-value accounts that weren't properly classified as restricted. This was a real governance finding that would require action!
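One detail of this query is easy to miss: the explicit IS NULL check. In SQL, comparing NULL with anything yields NULL rather than true, so a filter on classification_level alone silently drops accounts that have no classification record at all. The sketch below illustrates this with an in-memory SQLite database and a handful of invented rows; it does not reproduce the bank's data or the 28 accounts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample data:
#   account 1 is correctly Restricted, account 2 is misclassified,
#   account 3 has no classification record, account 4 is below the threshold.
conn.executescript("""
    CREATE TABLE accounts (
        account_id INTEGER PRIMARY KEY, account_type TEXT, balance REAL);
    CREATE TABLE data_classification (
        data_element_id INTEGER, classification_level TEXT);
    INSERT INTO accounts VALUES
        (1, 'Corporate', 5000000), (2, 'Savings', 2000000),
        (3, 'Current', 3000000), (4, 'Savings', 50000);
    INSERT INTO data_classification VALUES (1, 'Restricted'), (2, 'Internal');
""")

# With the IS NULL check: catches both misclassified and unclassified accounts.
findings = conn.execute("""
    SELECT a.account_id,
           COALESCE(dc.classification_level, 'Unclassified') AS classification
    FROM accounts a
    LEFT JOIN data_classification dc ON a.account_id = dc.data_element_id
    WHERE a.balance > 1000000
      AND (dc.classification_level IS NULL
           OR dc.classification_level <> 'Restricted')
    ORDER BY a.account_id
""").fetchall()

# Without it: the unclassified account disappears, because NULL <> 'Restricted'
# evaluates to NULL and the WHERE clause keeps only rows that are true.
naive_findings = conn.execute("""
    SELECT a.account_id
    FROM accounts a
    LEFT JOIN data_classification dc ON a.account_id = dc.data_element_id
    WHERE a.balance > 1000000
      AND dc.classification_level <> 'Restricted'
    ORDER BY a.account_id
""").fetchall()

print(findings)
print(naive_findings)
```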
I showed Tola my results, and she smiled. "This is exactly why governance professionals need SQL skills," she said. "You've just identified a compliance gap that might have gone unnoticed."
That moment changed my perspective on SQL. It wasn't just a technical skill—it was an investigative tool that gave me direct access to evidence about how well our governance policies were implemented.
Building a Governance Query Library
At the end of the workshop, Tola encouraged each of us to develop a personal "governance query library": a collection of useful SQL queries for common governance tasks.
"Over time, you'll develop queries for tasks you perform regularly," she explained. "These become valuable tools you can reuse and share with colleagues."
I started my library with queries for:
  • Finding potentially sensitive data columns
  • Checking for data quality issues in critical tables
  • Verifying access controls on restricted data
  • Validating transformation rules in our data pipelines
Each query represented not just SQL syntax but a governance question I could now answer independently.
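A query library does not need special tooling. One simple way to keep it, sketched below, is a dictionary that stores each query next to the governance question it answers. The QUERY_LIBRARY structure, the run_query helper, and the sample tables are hypothetical and only illustrate the idea.

```python
import sqlite3

# Each entry pairs a governance question with the SQL that answers it.
QUERY_LIBRARY = {
    "missing_emails": {
        "question": "How many customers have no email on file?",
        "sql": "SELECT COUNT(*) FROM customers "
               "WHERE email IS NULL OR email = ''",
    },
    "high_value_accounts": {
        "question": "Which accounts exceed a given balance threshold?",
        "sql": "SELECT account_id FROM accounts "
               "WHERE balance > :threshold ORDER BY account_id",
    },
}


def run_query(conn, name, **params):
    """Look up a saved query by name and run it with named parameters."""
    entry = QUERY_LIBRARY[name]
    return conn.execute(entry["sql"], params).fetchall()


# Invented sandbox data to exercise the library.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, balance REAL);
    INSERT INTO customers VALUES (1, 'a@example.com'), (2, NULL), (3, '');
    INSERT INTO accounts VALUES (10, 2500000), (11, 40000);
""")

missing = run_query(conn, "missing_emails")
high_value = run_query(conn, "high_value_accounts", threshold=1_000_000)
print(missing, high_value)
```

Storing the question alongside the SQL keeps the library readable for colleagues who want to know what a query is for before they run it.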
Reflections on SQL for Governance

  1. Independence: Direct investigation without dependencies
  2. Verification: Evidence-based governance
  3. Connection: Linking governance areas
  4. Foundation: SQL as a core governance skill
That evening, updating my learning journal, I reflected on how this new skill connected to everything else I had learned:
SQL gives me the ability to verify what I previously had to take on faith. When reviewing metadata, I can now check if it actually matches the database. When assessing data quality, I can directly measure it rather than relying solely on reports. When validating lineage, I can test if data transformations work as documented.
I don't need to be a SQL expert—I just need enough skill to independently investigate governance questions. It's like learning to read in a world where I previously needed others to read to me.
I also noted how SQL connected to the broader governance picture:
- Data Quality: SQL helps validate if quality rules are being followed
- Metadata Management: SQL can verify if metadata is accurate
- Data Classification: SQL can identify misclassified or unclassified data
- Lineage: SQL can test if transformations match documentation
- Access Control: SQL can audit who has access to what
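Access control auditing is the one area in this list that the workshop queries did not demonstrate. How grants are stored differs from one database to another, so the sketch below assumes they have been exported into a plain table. The access_grants and approved_access tables, their columns, and the user names are all hypothetical. The query lists grants on Restricted tables that have no matching approval.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and invented rows, for illustration only.
conn.executescript("""
    CREATE TABLE data_classification (
        table_name TEXT, classification_level TEXT);
    CREATE TABLE access_grants (
        user_name TEXT, table_name TEXT, privilege TEXT);
    CREATE TABLE approved_access (user_name TEXT, table_name TEXT);
    INSERT INTO data_classification VALUES
        ('accounts', 'Restricted'), ('branches', 'Public');
    INSERT INTO access_grants VALUES
        ('analyst_a', 'accounts', 'SELECT'),
        ('analyst_b', 'accounts', 'SELECT'),
        ('analyst_b', 'branches', 'SELECT');
    INSERT INTO approved_access VALUES ('analyst_a', 'accounts');
""")

# Grants on Restricted tables with no recorded approval.
unapproved_grants = conn.execute("""
    SELECT g.user_name, g.table_name, g.privilege
    FROM access_grants g
    JOIN data_classification dc ON g.table_name = dc.table_name
    LEFT JOIN approved_access ap
        ON g.user_name = ap.user_name AND g.table_name = ap.table_name
    WHERE dc.classification_level = 'Restricted'
      AND ap.user_name IS NULL
    ORDER BY g.user_name, g.table_name
""").fetchall()

print(unapproved_grants)
```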
Crossing the Threshold
New Independence
My SQL skills were basic, and I knew I had much more to learn. But I had crossed an important threshold—from being dependent on others for data access to having the ability to directly investigate governance issues.
Bridging Theory and Practice
As I closed my laptop, I felt a new sense of empowerment. Governance wasn't just about creating policies and frameworks; it was about verifying they were properly implemented. And with SQL, I now had a powerful tool to do just that.
Taking Action
Tomorrow, I would put my new skills to work by investigating those misclassified high-value accounts. It was time to close the gap between governance theory and practice—one query at a time.
Key Takeaways
  1. Investigative Tool: SQL is a vital investigative tool for governance professionals to verify policies are implemented correctly
  2. Direct Access: Direct data access reduces dependency on others and provides firsthand evidence
  3. Governance Support: SQL supports governance work through data quality validation, metadata verification, classification auditing, and lineage validation
  4. Query Library: Building a governance query library creates reusable tools for common governance tasks
  5. Bridge the Gap: SQL bridges the gap between governance theory and practical implementation
Discussion Questions
How has access to data querying capabilities improved your governance work?
Share your experiences with using SQL or other querying tools to enhance your data governance practices. How has direct access to data changed your approach to governance challenges?
What governance questions could you answer with SQL that are difficult to address otherwise?
Consider specific governance scenarios where SQL provides unique insights that would be challenging to obtain through other means. What types of compliance or quality issues become more visible?
How do you balance technical skills like SQL with the policy and framework aspects of governance?
Discuss strategies for integrating technical validation with the more traditional policy development aspects of governance. How do these different approaches complement each other?