Content Scripts

Content scripts are JavaScript files that run in the context of web pages. By using the standard Document Object Model (DOM), they can read details of the web pages the browser visits, or make changes to them.

Contents

  1. Understand Content Script Capabilities
  2. Manifest
    1. Match patterns and globs
  3. Programmatic injection
  4. Execution environment
  5. Communication with the embedding page
  6. Security considerations
  7. Referring to extension files

Understand Content Script Capabilities

Content scripts can use the Chrome APIs available to their parent extension indirectly, by exchanging messages with it. They can also retrieve information by making cross-site XMLHttpRequests to parent sites, and they can get the URL of an extension’s file with chrome.runtime.getURL() and use the result like any other URL.

//Code for displaying extensionDir/images/myimage.png:
var imgURL = chrome.runtime.getURL("images/myimage.png");
document.getElementById("someImage").src = imgURL;
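The message exchange mentioned above could look roughly like the following. This is a minimal sketch, not a prescribed pattern: the message type GET_GREETING and the function names handleRequest and requestGreeting are invented for illustration, and in a real extension the two halves would live in separate files (a background script and a content script).

```javascript
// --- background script (hypothetical) ---
// Answers requests that a content script cannot fulfil on its own.
function handleRequest(request, sender, sendResponse) {
  if (request && request.type === 'GET_GREETING') {
    sendResponse({ greeting: 'hello from the extension' });
  }
}

// --- content script (hypothetical) ---
// `runtime` is chrome.runtime in a real extension; passing it in as a
// parameter keeps the function easy to exercise outside the browser.
function requestGreeting(runtime, onGreeting) {
  runtime.sendMessage({ type: 'GET_GREETING' }, function (response) {
    // response is undefined when no listener answered.
    onGreeting(response ? response.greeting : undefined);
  });
}

// Register the listener only when the extension APIs are present.
if (typeof chrome !== 'undefined' && chrome.runtime && chrome.runtime.onMessage) {
  chrome.runtime.onMessage.addListener(handleRequest);
}
```

In a content script the call would be requestGreeting(chrome.runtime, callback); the background side does the privileged work and sends back only the result.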

Additionally, content scripts can access the following chrome APIs directly:

Content scripts are unable to access other APIs directly.

Opera implements an additional privacy protection mechanism. By default, extensions are not allowed to access and manipulate search results provided by most built-in engines. Users can give access to search page results on the Extensions list:

Allow access to search page results

Manifest

If your content script’s code should always be injected, register it in the extension manifest using the content_scripts field, as in the following example.

{
	"name": "My extension",
	"content_scripts": [
		{
			"matches": ["http://www.google.com/*"],
			"css": ["mystyles.css"],
			"js": ["jquery.js", "myscript.js"]
		}
	]
}

If you want to inject the code only sometimes, use the permissions field instead, as described in Programmatic injection.

{
	"name": "My extension",
	"permissions": [
		"tabs", "http://www.google.com/*"
	]
}

Using the content_scripts field, an extension can insert multiple content scripts into a page; each of these content scripts can have multiple JavaScript and CSS files. Each item in the content_scripts array can have the following properties:

Name | Type | Description
matches | array of strings | Required. Specifies which pages this content script will be injected into. See Match Patterns for more details on the syntax of these strings and Match patterns and globs for information on how to exclude URLs.
exclude_matches | array of strings | Optional. Excludes pages that this content script would otherwise be injected into. See Match Patterns for more details on the syntax of these strings and Match patterns and globs for information on how to exclude URLs.
css | array of strings | Optional. The list of CSS files to be injected into matching pages. These are injected in the order they appear in this array, before any DOM is constructed or displayed for the page.
js | array of strings | Optional. The list of JavaScript files to be injected into matching pages. These are injected in the order they appear in this array.
run_at | string | Optional. Controls when the files in js are injected. Can be document_start, document_end, or document_idle. Defaults to document_idle. In the case of document_start, the files are injected after any files from css, but before any other DOM is constructed or any other script is run. In the case of document_end, the files are injected immediately after the DOM is complete, but before subresources like images and frames have loaded. In the case of document_idle, the browser chooses a time to inject scripts between document_end and immediately after the window.onload event fires. The exact moment of injection depends on how complex the document is and how long it is taking to load, and is optimized for page load speed. Note: With document_idle, content scripts may not necessarily receive the window.onload event, because they may run after it has already fired. In most cases, listening for the onload event is unnecessary for content scripts running at document_idle because they are guaranteed to run after the DOM is complete. If your script definitely needs to run after window.onload, you can check if onload has already fired by using the document.readyState property.
all_frames | boolean | Optional. Controls whether the content script runs in all frames of the matching page, or only the top frame. Defaults to false, meaning that only the top frame is matched.
include_globs | array of strings | Optional. Applied after matches to include only those URLs that also match this glob. Intended to emulate the @include Greasemonkey keyword. See Match patterns and globs below for more details.
exclude_globs | array of strings | Optional. Applied after matches to exclude URLs that match this glob. Intended to emulate the @exclude Greasemonkey keyword. See Match patterns and globs below for more details.
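The run_at note about document.readyState can be made concrete with a small helper. This is a sketch under the assumption that a readyState of 'complete' is a good enough signal that the load event has fired or is about to; the name runWhenLoaded is hypothetical.

```javascript
// Runs `callback` once the page has fully loaded, whether or not the
// load event fired before this content script was injected.
// `doc` and `win` are the page's document and window.
function runWhenLoaded(doc, win, callback) {
  if (doc.readyState === 'complete') {
    // Loading has finished; a load listener added now might never fire.
    callback();
  } else {
    win.addEventListener('load', function () {
      callback();
    }, false);
  }
}

// Only runs in a browser context.
if (typeof document !== 'undefined' && typeof window !== 'undefined') {
  runWhenLoaded(document, window, function () {
    console.log('Page fully loaded; images and frames are available.');
  });
}
```

For scripts that only need the DOM, no such check is needed at document_idle or document_end.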

Match patterns and globs

The content script will be injected into a page if its URL matches any matches pattern and any include_globs pattern, as long as the URL doesn’t also match an exclude_matches or exclude_globs pattern. Because the matches property is required, exclude_matches, include_globs, and exclude_globs can only be used to limit which pages will be affected.

For example, assume matches is ["http://*.nytimes.com/*"]:

  • If exclude_matches is ["*://*/*business*"], then the content script would be injected into http://www.nytimes.com/health but not into http://www.nytimes.com/business.
  • If include_globs is ["*nytimes.com/???s/*"], then the content script would be injected into http://www.nytimes.com/arts/index.html and http://www.nytimes.com/jobs/index.html but not into http://www.nytimes.com/sports/index.html.
  • If exclude_globs is ["*science*"], then the content script would be injected into http://www.nytimes.com but not into http://science.nytimes.com or http://www.nytimes.com/science.

Glob properties follow a different, more flexible syntax than match patterns. Acceptable glob strings are URLs that may contain “wildcard” asterisks and question marks. The asterisk * matches any string of any length (including the empty string); the question mark ? matches any single character.

For example, the glob http://???.example.com/foo/* matches any of the following:

  • http://www.example.com/foo/bar
  • http://the.example.com/foo/

However, it does not match the following:

  • http://my.example.com/foo/bar
  • http://example.com/foo/
  • http://www.example.com/foo
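To make the glob rules concrete, the matching can be approximated by translating a glob into a regular expression. This is an illustrative sketch of the semantics described above (* matches any string, ? matches exactly one character, and the whole URL must match); it is not the browser's actual implementation, and the names globToRegExp and globMatches are hypothetical.

```javascript
// Translates a glob into an anchored regular expression.
function globToRegExp(glob) {
  var pattern = '';
  for (var i = 0; i < glob.length; i++) {
    var ch = glob[i];
    if (ch === '*') {
      pattern += '.*';          // any string, including the empty string
    } else if (ch === '?') {
      pattern += '.';           // exactly one character
    } else {
      // Escape characters that are special in regular expressions.
      pattern += ch.replace(/[\\^$+.()|[\]{}]/g, '\\$&');
    }
  }
  return new RegExp('^' + pattern + '$');
}

// Returns true when the whole URL matches the glob.
function globMatches(glob, url) {
  return globToRegExp(glob).test(url);
}

console.log(globMatches('http://???.example.com/foo/*', 'http://www.example.com/foo/bar'));
console.log(globMatches('http://???.example.com/foo/*', 'http://my.example.com/foo/bar'));
```

The first call logs true and the second logs false, in line with the lists above: "my" is only two characters, so ??? cannot match it.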

Programmatic injection

Inserting code into a page programmatically is useful when your JavaScript or CSS code shouldn’t be injected into every single page that matches the pattern — for example, if you want a script to run only when the user clicks a browser action’s icon.

To insert code into a page, your extension must have cross-origin permissions for the page. It also must be able to use the chrome.tabs module. You can get both kinds of permission using the manifest file’s permissions field.

Once you have permissions set up, you can inject JavaScript into a page by calling tabs.executeScript. To inject CSS, use tabs.insertCSS.

The following code (from the make_page_red example) reacts to a user click by inserting JavaScript into the current tab’s page and executing the script.

// in background.html
chrome.browserAction.onClicked.addListener(function(tab) {
	chrome.tabs.executeScript(null, {
		code: 'document.body.style.color = "red"'
	});
});

// in manifest.json
"permissions": [
	"tabs", "http://*/*"
],

When the browser is displaying an HTTP page and the user clicks this extension’s browser action, the extension sets the color property of the page’s body to red. The result is that any text that inherits its color from the body turns red.

Usually, instead of inserting code directly (as in the previous sample), you put the code in a file. You inject the file’s contents like this:

chrome.tabs.executeScript(
	null, {
		file: 'content_script.js'
	}
);
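Stylesheets and scripts can also be injected together. The following sketch inserts a CSS file with tabs.insertCSS and, once that call reports back, runs a script file with tabs.executeScript. The helper name injectIntoTab is hypothetical, the file names are reused from the earlier examples, and the manifest is assumed to declare the "tabs" permission plus a matching host permission.

```javascript
// background script (sketch)
// `tabsApi` is chrome.tabs in a real extension; it is a parameter here
// so the helper can be exercised without a browser.
function injectIntoTab(tabsApi, tabId) {
  // Insert the stylesheet first, then the script that relies on it.
  tabsApi.insertCSS(tabId, { file: 'mystyles.css' }, function () {
    tabsApi.executeScript(tabId, { file: 'content_script.js' });
  });
}

// Only wire up the click handler when the extension APIs exist.
if (typeof chrome !== 'undefined' && chrome.browserAction && chrome.tabs) {
  chrome.browserAction.onClicked.addListener(function (tab) {
    injectIntoTab(chrome.tabs, tab.id);
  });
}
```

Passing tab.id rather than null makes the target explicit; null targets the active tab of the current window, as in the samples above.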

Execution environment

Content scripts execute in a special environment called an isolated world. They have access to the DOM of the page they are injected into, but not to any JavaScript variables or functions created by the page. It looks to each content script as if there is no other JavaScript executing on the page it is running on. The same is true in reverse: JavaScript running on the page cannot call any functions or access any variables defined by content scripts.

For example, consider this simple page:

<!-- hello.html -->
<html>
	<button id="mybutton">click me</button>
	<script>
		var greeting = 'hello, ';
		var button = document.getElementById('mybutton');
		button.person_name = 'Bob';
		button.addEventListener('click', function() {
			alert(greeting + button.person_name + '.');
		}, false);
	</script>
</html>

Now, suppose this content script was injected into hello.html:

// contentscript.js
var greeting = 'hola, ';
var button = document.getElementById('mybutton');
button.person_name = 'Roberto';
button.addEventListener("click", function() {
	alert(greeting + button.person_name + '.');
}, false);

Now, if the button is pressed, you will see both greetings.

Isolated worlds allow each content script to make changes to its JavaScript environment without worrying about conflicting with the page or with other content scripts. For example, a content script could include jQuery v1 and the page could include jQuery v2, and they wouldn’t conflict with each other.

Another important benefit of isolated worlds is that they completely separate the JavaScript on the page from the JavaScript in extensions. This allows us to offer extra functionality to content scripts that should not be accessible from web pages without worrying about web pages accessing it.

Communication with the embedding page

Although the execution environments of content scripts and the pages that host them are isolated from each other, they share access to the page’s DOM. If the page wishes to communicate with the content script (or with the extension via the content script), it must do so through the shared DOM.

For example, the page and the content script can exchange messages using window.postMessage (or window.webkitPostMessage for Transferable objects):

// contentscript.js
var port = chrome.runtime.connect();

window.addEventListener('message', function(event) {
	// We only accept messages from ourselves
	if (event.source != window) {
		return;
	}

	if (event.data.type && (event.data.type == 'FROM_PAGE')) {
		console.log('Content script received: ' + event.data.text);
		port.postMessage(event.data.text);
	}
}, false);

// http://foo.com/example.html
document.getElementById('theButton').addEventListener('click', function() {
	window.postMessage({
		type: 'FROM_PAGE',
		text: 'Hello from the webpage!'
	}, '*');
}, false);

In the above example, example.html (which is not a part of the extension) posts messages to itself, which are intercepted and inspected by the content script, and then posted to the extension process. In this way, the page establishes a line of communication to the extension process. The reverse is possible through similar means.
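The reverse direction might be sketched as follows: the content script posts a message to the window, and a listener in the page picks it up. The message type FROM_EXTENSION and the function names are assumptions made for this illustration. Any script running in the page can post a message of the same shape, so the page should not treat it as proof that the extension sent it.

```javascript
// contentscript.js (sketch): pass data from the extension on to the page.
function sendToPage(win, text) {
  win.postMessage({ type: 'FROM_EXTENSION', text: text }, '*');
}

// Page script (sketch): builds a listener that accepts only messages
// posted on the same window and tagged FROM_EXTENSION.
function createPageListener(win, onText) {
  return function (event) {
    if (event.source !== win) {
      return;
    }
    if (event.data && event.data.type === 'FROM_EXTENSION') {
      onText(event.data.text);
    }
  };
}

// Only runs in a browser context.
if (typeof window !== 'undefined') {
  window.addEventListener('message', createPageListener(window, function (text) {
    console.log('Page received: ' + text);
  }), false);
}
```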

Security considerations

When writing a content script, you should be aware of two security issues. First, be careful not to introduce security vulnerabilities into the web site your content script is injected into. For example, if your content script receives content from another web site (for example, by making an XMLHttpRequest), be careful to filter that content for cross-site scripting attacks before injecting the content into the current page. For example, prefer to inject content via innerText rather than innerHTML. Be especially careful when retrieving HTTP content on an HTTPS page because the HTTP content might have been corrupted by a network “man-in-the-middle” if the user is on a hostile network.

Second, although running your content script in an isolated world provides some protection from the web page, a malicious web page might still be able to attack your content script if you use content from the web page indiscriminately. For example, the following patterns are dangerous:

// contentscript.js
var data = document.getElementById('json-data').textContent;
// WARNING! Might be evaluating an evil script!
var parsed = eval('(' + data + ')');

// contentscript.js
var elmt_id = …
// WARNING! elmt_id might be '); … evil script … //'!
window.setTimeout('animate(' + elmt_id + ')', 200);

Instead, prefer safer APIs that do not run scripts:

// contentscript.js
var data = document.getElementById('json-data').textContent;
// JSON.parse does not evaluate the attacker’s scripts.
var parsed = JSON.parse(data);

// contentscript.js
var elmt_id = …
// The closure form of setTimeout does not evaluate scripts
window.setTimeout(function() {
	animate(elmt_id);
}, 200);
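JSON.parse throws on malformed input, and a hostile page controls what the element contains, so it can help to wrap the call and check the result before using it. The following is one possible sketch; the helper name parsePageData and the rule of accepting only objects and arrays are assumptions for this example.

```javascript
// Parses text taken from the page. Returns the parsed object or array,
// or null when the text is not valid JSON or is not an object.
function parsePageData(text) {
  var value;
  try {
    value = JSON.parse(text);
  } catch (e) {
    return null; // malformed JSON is rejected rather than evaluated
  }
  if (value === null || typeof value !== 'object') {
    return null; // bare numbers, strings and booleans are rejected
  }
  return value;
}

// Only runs in a browser context.
if (typeof document !== 'undefined') {
  var node = document.getElementById('json-data');
  var parsed = node ? parsePageData(node.textContent) : null;
  console.log(parsed);
}
```

A successful parse only shows that the text is valid JSON; the values inside still come from the page and should be handled as untrusted.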

Referring to extension files

Get the URL of an extension’s file using chrome.runtime.getURL(). You can use the result just like you would any other URL, as the following code shows.

// Code for displaying <extensionDir>/images/myimage.png:
var imgURL = chrome.runtime.getURL('images/myimage.png');
document.getElementById('someImage').src = imgURL;