How to Block or Include Your Website Content in Microsoft Bing Chat

Image Credit: Maginative

Microsoft is now giving publishers and webmasters more granular control over how their content appears in Bing Chat, its conversational AI experience powered by GPT-4.

Since its launch, Bing Chat has been largely well received. However, some webmasters and publishers have raised growing concerns about how their content is accessed and used. Last month, OpenAI introduced GPTBot and set a precedent by letting publishers block its user agent via the robots.txt file.

Taking a similar approach, Microsoft is now providing greater control over content visibility in Bing Chat. However, unlike GPTBot, which is blocked through a new user-agent rule in robots.txt, Microsoft is building on existing web standards: the noarchive and nocache values of the robots meta tag. Publishers can now make informed decisions about how their content is used in Bing Chat and in the training of Microsoft's generative AI foundation models.

For clarity, the controls operate as follows:

Default Settings: If webmasters take no action, content that carries neither the nocache nor the noarchive tag may be included in Bing Chat responses. This could improve the quality of AI-generated answers and boost the content's visibility in Bing Chat. Such content may also be used to train Microsoft's generative AI foundation models.

NOCACHE Tag: Content labeled with the nocache tag may feature in Bing Chat responses, but only its URL, title, and snippet will be displayed. These same elements may be used in the training of Microsoft's generative AI foundation models.

<meta name="robots" content="nocache">

NOARCHIVE Tag: Material marked with the noarchive tag will be excluded from Bing Chat responses. Furthermore, such content will not be utilized during the training of Microsoft's generative AI foundation models.

<meta name="robots" content="noarchive">

Combined Tags: If webmasters opt for both nocache and noarchive tags, Microsoft will treat the content as nocache.

<meta name="robots" content="noarchive, nocache">

Bing-Specific Tags: Webmasters can also address Bing alone with a bingbot meta tag, so that Bing treats the content as nocache while the general robots tag stays noarchive.

<meta name="robots" content="noarchive">
<meta name="bingbot" content="nocache">

For webmasters who want finer control, or who want to help Bing Chat users discover paywalled content, the nocache value can be combined with the noarchive tag, as in the examples above. Microsoft says that content labeled with either nocache or noarchive will still appear in Bing's traditional search results.
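To see how these rules combine, here is a minimal Python sketch that reads a page's meta tags and reports how Bing Chat would be expected to treat it. It is an illustration, not Microsoft's implementation. The function name bing_chat_policy and the labels "full", "limited", and "excluded" are hypothetical. The sketch also assumes that directives from the robots and bingbot tags are pooled before the "nocache wins" rule is applied; that matches the examples above, but Microsoft has not documented the precedence in detail.

```python
from html.parser import HTMLParser


class RobotsMetaCollector(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="bingbot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. a bare attribute), so default to "".
        attr = {key.lower(): (value or "") for key, value in attrs}
        if attr.get("name", "").strip().lower() in ("robots", "bingbot"):
            for value in attr.get("content", "").split(","):
                value = value.strip().lower()
                if value:
                    self.directives.add(value)


def bing_chat_policy(html):
    """Return a hypothetical label for how Bing Chat may treat the page.

    "full"     - neither tag present: content may appear in answers and training
    "limited"  - nocache present: only URL, title, and snippet are used
    "excluded" - noarchive alone: kept out of Bing Chat answers and training
    """
    collector = RobotsMetaCollector()
    collector.feed(html)
    found = collector.directives  # assumption: robots and bingbot values are pooled
    if "nocache" in found:
        # Per the article, nocache takes precedence when both tags are present.
        return "limited"
    if "noarchive" in found:
        return "excluded"
    return "full"


if __name__ == "__main__":
    page = '<meta name="robots" content="noarchive, nocache">'
    print(bing_chat_policy(page))
```

Running the script on the combined-tags example prints the "limited" label, reflecting that Microsoft treats such a page as nocache. Other robots values, such as noindex, are outside the scope of this sketch.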

Microsoft's announcement comes as OpenAI reintroduces its "ChatGPT Browse" functionality, which leverages Bing to search the web. OpenAI initially rolled out ChatGPT Browse in May, allowing users to query the internet for recent information to help answer questions. However, the feature was temporarily disabled in July after OpenAI found it could sometimes inadvertently display paywalled content in full.

As OpenAI brings back this search integration, Microsoft's move to expand publisher controls over Bing Chat content visibility represents a parallel effort to provide guardrails and address emergent concerns around AI chatbots displaying copyrighted material without permission.

In the broader context, as AI platforms evolve, the dance between content publishers and AI giants continues. The crux remains in striking a delicate balance: ensuring accessibility while respecting the rights and wishes of content creators.
