Data-ish • AI in analysis
Short on time? Here's the gist.
AI will not save you time on evaluation if you do not know how to interrogate what it gives you. Without a human with a plan at the helm, AI could cost organizations more than it saves.
Navigating data alone is hard. This month, we are launching the Data-ish community, a space to go deeper than the newsletter.
Read on to learn more
Things that make you go hmmmm.
Stick these to your monitor or, better yet, scribble them on the back of a napkin.
AI Computation - Human Oversight = Flawed Interpretation
AI has no institutional memory.
It does not know that Tuesday attendance spikes because that is when the free bus route runs to your facility. It does not know that the blank cells in column D mean the intake form changed in March, not that the data is missing.
It produces a confident, well-formatted output regardless of whether the underlying logic holds up.
You need a human in the loop.
Before the tool touches anything, a human designs the plan.
Why are we using AI for this analysis? What do we actually need to know? If those questions go unanswered, the tool will fill the gaps with math, and the math will be confident, and the confidence will be wrong in ways that are hard to spot later.
When you prompt
A human who understands the research question can ask for what the analysis needs rather than what the organization wants to be true. How you write the prompt shapes what comes back.
When you read the output
You are the interpretation layer the tool does not have and cannot build. You are the one who determines whether the pattern the tool found reflects your program or something else entirely happening in the same timeframe.
Field Notes
The Vanishing Data Problem
AI will quietly drop incomplete rows and analyze the cleaner, smaller dataset it creates in the process.
Example
A youth mentorship organization asked an AI to analyze grade improvements among 100 students. The tool reported an average GPA increase of 0.4 points. What it did not report was that 20 students had been dropped from the analysis because their data forms had blank fields. Those 20 students had grade declines.
Reduce risk
Use a prompt that tells the tool what to do with incomplete data rather than leaving it to decide: "If there are blank cells, group them separately as Unknown/Missing rather than excluding them."
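If you want to see what "group them separately" looks like outside the AI tool, here is a minimal Python sketch. The records, the field names (`school`, `gpa_change`), and the function name are all invented for illustration; they are not the mentorship organization's data. The idea is simply that blank values get their own group instead of disappearing.

```python
from collections import defaultdict
from statistics import mean


def group_with_missing(rows, group_field, value_field):
    """Average value_field per group, keeping blank groups as 'Unknown/Missing'."""
    groups = defaultdict(list)
    for row in rows:
        # Treat None, empty strings, and whitespace-only values as missing.
        label = (row.get(group_field) or "").strip() or "Unknown/Missing"
        groups[label].append(row[value_field])
    return {
        label: {"n": len(values), "mean": round(mean(values), 2)}
        for label, values in groups.items()
    }


# Hypothetical records, made up for this example only.
students = [
    {"school": "North", "gpa_change": 0.5},
    {"school": "North", "gpa_change": 0.3},
    {"school": "", "gpa_change": -0.2},
    {"school": None, "gpa_change": -0.4},
]

summary = group_with_missing(students, "school", "gpa_change")
for label, stats in summary.items():
    print(label, stats)
```

In this toy data the Unknown/Missing group tells a different story than the complete rows, which is exactly what a silent drop would hide.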
Correlation Without Context
AI finds mathematical patterns. It cannot tell you whether the pattern reflects your program or something else entirely that happened in the same timeframe.
Example
A food security nonprofit tracking donor sign-ups used AI to review its data. The tool identified a spike on Thursdays in October and concluded that Thursday social media posts were the organization's most powerful fundraising tool. In reality, a local celebrity had reposted one of the organization's posts on a Thursday in October. The AI had no way to know that. Staff who did not probe the conclusion began reallocating resources toward Thursday content.
Reduce risk
When AI surfaces a trend, the follow-up prompt should be: "What are three alternative external factors that could explain this pattern, separate from the one you identified?" Then a human decides what is true.
AI wants you to be happy
If your prompt implies an expected finding, many AI tools will shape their analysis to confirm it.
Example
An organization held a workshop they believed had gone well. What they did not know was that attendance had dropped 15% from the previous session, and the post-session survey had only been completed by the participants who stayed to fill it out.
A program manager uploaded the attendance and survey data with this prompt: "We think our new weekend workshop was a success. What data points show that?" The AI found the data points that showed success because that is what it was asked to find.
Reduce risk
Keep your prompt neutral: "Analyze this data. Identify the strongest findings and the weakest findings based on the numbers. Do not optimize for positive outcomes." A human who knows the program then decides whether the findings make sense.
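Two numbers would have flagged the problem in the workshop example before any prompt was written: the change in attendance and the share of attendees who answered the survey. A rough Python sketch follows. The function name and the counts are hypothetical, chosen only so that the attendance drop matches the 15% in the example.

```python
def context_check(previous_attendance, current_attendance, surveys_completed):
    """Return attendance change and survey response rate, both as percentages."""
    change = 100 * (current_attendance - previous_attendance) / previous_attendance
    response_rate = 100 * surveys_completed / current_attendance
    return {
        "attendance_change_pct": round(change, 1),
        "survey_response_rate_pct": round(response_rate, 1),
    }


# Hypothetical counts: 40 people last session, 34 this session, 17 surveys returned.
check = context_check(previous_attendance=40, current_attendance=34, surveys_completed=17)
print(check)
```

If the response rate is low, glowing survey results describe the people who stayed, not the workshop as a whole.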
Engaging AI with your data checklist
✔ Clean your columns.
Every column should have a clear, consistent name.
✔ Anonymize your data.
Remove names, addresses, and identifying information before it leaves your systems.
✔ Know what your empty cells mean.
Decide before you upload. The tool will decide for you if you do not.
✔ State your context first.
Open your prompt by telling the tool who you are and what you are measuring.
✔ Ask for the methodology.
End every prompt with: "Explain step by step how you calculated these results and list any data points you excluded." Then read it. You are the one who decides whether what the tool excluded should have been.
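The first two checklist items can be done by hand in a spreadsheet. If you would rather script them, this is one possible sketch in Python. The list of identifying columns and the sample row are assumptions made for illustration; your own data will need its own list. Note that dropping columns is only a first step, since free-text notes or rare combinations of fields can still identify someone.

```python
import re

# Hypothetical set of identifying columns; adjust it to your own data.
IDENTIFYING_COLUMNS = {"name", "address", "email", "phone"}


def clean_header(name):
    """Trim, lowercase, and replace runs of other characters with underscores."""
    return re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")


def prepare_rows(rows):
    """Return rows with consistent column names and identifying columns removed."""
    prepared = []
    for row in rows:
        cleaned = {clean_header(key): value for key, value in row.items()}
        prepared.append(
            {key: value for key, value in cleaned.items() if key not in IDENTIFYING_COLUMNS}
        )
    return prepared


# Made-up example row.
raw = [{" Name ": "Sample Person", "Visit Date": "2024-03-05", "Program/Site": "East"}]
safe = prepare_rows(raw)
print(safe)
```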
Data-ish Community
When I was a development director, I showed up to every training and read every article with data in the title. I took feverish notes trying to make sense of numbers I was suddenly responsible for.
I built the Data-ish newsletter for everyone who has ever found themselves in that seat. As budgets shrink and demands for data grow, a newsletter stopped feeling like enough.
The Data-ish community is for people who want to go further. Each month, members get an expanded issue, a structured activity to try with their own data, and a live Zoom to talk through what they found with each other and with me.
This month, everything is open to everyone.
Read on to see what the community is about and join our free Zoom on June 9th.
Data-ish members only content
Prompt for what you need, not what you want.
Meh: "Here is our after-school program data. We think the tutoring section is our strongest asset. Please analyze the trends."
Mwah: "Analyze the attached spreadsheet. Identify the top two performing program segments and the bottom two performing program segments based strictly on metric improvement. Do not optimize for positive outcomes."
Meh: "Here is our attendance tracker for the year. What are the trends?"
Mwah: "Analyze this attendance tracker. Before identifying any trends, tell me how many rows contain blank or incomplete data, what percentage of the total dataset that represents, and how you handled those rows in your analysis."
Meh: "We need to show progress on our workforce development outcomes for our mid-year funder report. Here is the participant data. What can we say?"
Mwah: "Analyze this participant data against these outcomes: [list outcomes]. For each outcome, tell me what the data shows, whether the evidence is strong or limited, and what qualifications a funder would reasonably expect to see in the report. Do not frame findings more positively than the data supports."
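When the tool tells you how many rows were blank or incomplete, it helps to have your own count to compare against. Here is a small Python sketch of that cross-check. The function name and the sample attendance rows are hypothetical, and "incomplete" here means any row with at least one empty cell, which is an assumption you may want to tighten.

```python
def incomplete_row_report(rows):
    """Count rows with at least one blank cell and report their share of the total."""

    def is_blank(value):
        return value is None or str(value).strip() == ""

    # Row numbers start at 1 so they are easy to match to a spreadsheet.
    incomplete = [
        number
        for number, row in enumerate(rows, start=1)
        if any(is_blank(value) for value in row.values())
    ]
    total = len(rows)
    percent = round(100 * len(incomplete) / total, 1) if total else 0.0
    return {
        "total_rows": total,
        "incomplete_rows": len(incomplete),
        "percent_incomplete": percent,
        "row_numbers": incomplete,
    }


# Made-up attendance rows for illustration.
tracker = [
    {"date": "2024-03-05", "site": "East", "attended": 12},
    {"date": "2024-03-12", "site": "", "attended": 9},
    {"date": "2024-03-19", "site": "East", "attended": 14},
    {"date": "2024-03-26", "site": "West", "attended": 11},
]

report = incomplete_row_report(tracker)
print(report)
```

If your count and the tool's count disagree, that is the first thing to ask it about.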
Make AI be the skeptic.
Assign AI a role that works against its agreeable default.
"You are a skeptical outside auditor hired to find flaws, missing data, and negative trends in this spreadsheet. Your success is measured by how rigorously you challenge assumptions and identify mathematical inconsistencies. Give me the harshest objective critique of this data."
Slow the model down.
"Start by looking at this data critically and identifying any discrepancies or inconsistencies, then provide the analysis." This shifts the tool's posture before it reaches any conclusions.
Members Only Activity
Test how AI's outputs differ based on the quality of your data and the prompts you provide.
Step One: Go in cold.
Take a dataset from your own work and anonymize it. Do not edit anything else about it. Feed it to an AI tool exactly as it is with this neutral prompt: "Analyze this dataset. Identify trends and suggest possible next steps based on what you find."
Save the full output.
Step Two: Review/refine your data
Open the dataset and think critically about it. What do the blank cells mean? Are there duplicates? What happened during this period that is not visible in the spreadsheet? Do the column names describe what the columns actually contain?
Clean up what you can.
Fix column names.
Flag or remove rows that are structurally broken.
Note what you cannot fix and why.
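For the duplicates question in this step, a spreadsheet's built-in tools are usually enough. If your dataset is large, a short script can list exact repeats for you. This is a sketch under the assumption that a duplicate means every column matches; the function name and rows are hypothetical.

```python
from collections import Counter


def find_duplicates(rows):
    """Return each row that appears more than once, with how many times it appears."""
    # Sorting the (column, value) pairs makes column order irrelevant.
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    return [
        {"row": dict(key), "times_seen": seen}
        for key, seen in counts.items()
        if seen > 1
    ]


# Made-up intake rows for illustration.
intake = [
    {"participant_id": "A1", "session": "spring"},
    {"participant_id": "B2", "session": "spring"},
    {"participant_id": "A1", "session": "spring"},
]

duplicates = find_duplicates(intake)
print(duplicates)
```

Whether a repeat is an error or a real second visit is a call only someone who knows the program can make, so treat the list as a set of questions rather than rows to delete.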
Feed it to the AI tool with the same prompt: "Analyze this dataset. Identify trends and suggest possible next steps based on what you find."
Save the output.
Step Three: Change the prompt
Rewrite your prompt using techniques from this issue.
Tell the tool explicitly how to handle incomplete or missing data.
Assign it the skeptical auditor persona.
Ask it to review the data before analyzing it.
Run the analysis again on the cleaned dataset.
Save the output and compare all three.
Did the first or second step produce misleading findings?
What did the data cleaning change?
How did the prompt change the output?
Bring your outputs and experience to our June Zoom chat.