Publicly Accessible Data and AI Analysis

Beyond their role in training artificial intelligence, Text and Data Mining (TDM) and data analytics have become essential utilities across both the public and private sectors. Today, the analysis of large-scale datasets drives critical applications in search, security, and product development, while accelerating advances in manufacturing, healthcare, education, and cybersecurity. Because these diverse activities rely heavily on publicly accessible data (often perceived as a legally safe and cost-effective foundation), exploring the actual prevalence of TDM and the evolving realities of its legal risk is increasingly urgent. To navigate this evolving landscape, the Open Data Charter (ODC) is thrilled to officially announce our latest research project, supported by Microsoft: “Publicly Accessible Data: Barriers and Opportunities for Putting Data to work”.

1. Project underway: exploring the landscape of publicly accessible data and AI analysis

This project examines the legal and operational safeguards for, and public perceptions of, using publicly accessible data for AI analysis. It focuses on three target jurisdictions, Brazil, Japan, and Australia, though we are also exploring insights globally. We explore how different countries interpret exceptions and limitations for Text and Data Mining (TDM) and AI analysis, and how those interpretations shape approaches to analyzing data, in order to better understand the opportunities and risks of policy interventions.

With this new initiative, we will continue to deepen our focus on developing research, actionable recommendations, and best practices for the responsible use of data in AI systems. Our core mission at ODC has always been to foster collaboration among over 250 governments and organizations to promote well-governed data practices.

Through this work, we will produce a suite of resources aimed at giving developers and institutions the clarity and tools they need to use data responsibly in the innovation ecosystem.

2. Summit recap: first public presentation in India

We recently took this project on the road and on the global stage for its very first public presentation. On February 16, 2026, ODC hosted a specialized session at the India AI Impact Summit 2026 titled “When Copyright meets Data Analytics: Safeguards for Responsible Reuse”.

The session brought together expert speakers, including Fola Adeleke (Global Center on AI Governance), Vinay Narayan (Aapti), and Violeta Belver (ILDA), who shared crucial regional insights as follows:

The jurisdiction spectrum: our early desk research revealed vast differences. For example, Japan stands out as highly permissive due to clear copyright exceptions for information analysis. Meanwhile, Brazil currently lacks explicit TDM exceptions (though these are being debated in its upcoming AI Bill), and Australia relies strictly on existing “fair dealing” frameworks and specific licensing.

The global majority perspective: Violeta Belver (ILDA) emphasized that “publicly accessible” does not mean “free from limitations”. She warned that mass reuse of historical administrative data risks reproducing structural inequalities and algorithmic biases against marginalized groups if proper human rights safeguards aren’t in place.

The governance ambiguity: Vinay Narayan (Aapti) highlighted fragmented governance approaches and the urgent need for clarity regarding who accesses data, pointing out that even when data is technically public (such as on social media), caution is paramount.

Insights from the room

Through interactive polling with our diverse audience of academics, civil society, government officials, and private sector leaders, we discovered that while public data is “very important” to their work, major frictions remain. Chief among these challenges are a lack of clear TDM exceptions across borders and poor, inconsistent metadata, which makes auditing datasets for bias incredibly difficult.

To bridge these gaps, participants brainstormed some initial safeguards, such as documenting purpose and clarifying licensing.

3. We want to hear from you: take the survey!

The India AI Impact Summit was just the beginning. The insights gathered there are directly informing our future research agenda, and we are now moving into the deeper, participatory phase of the project.

We have officially launched our global stakeholder survey, and we need your expertise.

To guarantee the quality and real-world relevance of our final recommendations, we are gathering broader insights into current practices, perceptions, and challenges surrounding data reuse.

Who should respond?

We are looking for input from: institutional representatives and government technical experts; civil society organizations specializing in the intersection of data and AI; members of academia and dedicated research centers; and private sector actors actively working with publicly accessible data.

⏱️ Time commitment: the questionnaire takes approximately 15 minutes to complete.

🗓️ Deadline: please complete the survey by June 17th, 2026.

Thank you in advance for your collaboration. We look forward to learning from your experience!

Link to the survey in English: https://forms.gle/mTqGn2kXqyEdQAFX7

Link to the survey in Japanese: https://forms.gle/mKZQvGWz1YKw9bAU8

Link to the survey in Portuguese: https://forms.gle/MPg1tU6UsXagqe4M9

This article was first published by the Open Data Charter on Medium: https://medium.com/opendatacharter/publicly-accessible-data-and-ai-analysis-65b20e3b29d1