
Users will share data for AI training with GitHub Copilot, unless they opt out

GitHub has announced that, starting April 24, it will change how it uses data for its AI assistant Copilot. Interaction data from users of Copilot Free, Pro, and Pro+ will be used by default to train and improve AI models unless users explicitly opt out. The change does not apply to Copilot Business or Copilot Enterprise.

Neowin notes that, in practice, this amounts to an opt-out model: users who take no action before April 24 will be automatically enrolled in the training program. This shifts the responsibility onto users to actively adjust their privacy settings, which could spark debate over transparency and informed consent.

With this move, GitHub, a Microsoft subsidiary, is following a broader trend in the AI sector, where real-world data is becoming increasingly important for improving model performance. According to the company, training on real interactions leads to more accurate, context-aware suggestions, intended to help developers write code more efficiently and securely.

The data GitHub intends to use includes, among other things, Copilot prompts and output, code snippets, context around the cursor position, and user feedback on suggestions. Information such as file structures and interactions with features like chat and inline suggestions may also be included. In effect, this covers virtually every interaction a user has with Copilot.

Distinction Between Stored and Active Data

Notably, GitHub explicitly distinguishes between data at rest and active interactions. Content from private repositories is not used unless it is actively processed via Copilot: once a user invokes Copilot within a private repository, that interaction data may be used for model training, unless the user has opted out.

Users who do not want their data to be used can disable this in their privacy settings. GitHub states that existing preferences will be respected: users who previously chose not to share data for product improvement will automatically remain excluded from the new training program.

The decision is partly based on earlier experiments within Microsoft, where employee interaction data was already used to improve models. According to the company, this led to higher acceptance rates for suggestions and better performance across various programming languages, and it expects that expanding to a broader user group will reinforce this trend.

In addition, Microsoft emphasizes that the collected data may be shared with affiliated companies within its own organization, but not with external AI model providers. In doing so, the company aims to alleviate concerns about data sharing with third parties. Nevertheless, the use of developer data for training purposes remains a sensitive topic.

GitHub states that the future of AI-assisted software development depends on real-world input. By training models with actual development workflows, the company aims to further position Copilot as a reliable and productive assistant for programmers.
