Google and Google Scholar can boost the worldwide visibility and accessibility of your content. We work with publishers of scholarly information to index peer-reviewed papers, theses, preprints, abstracts, and technical reports from all disciplines of research and make them searchable on Google and Google Scholar. This page provides policy and technical information for scholarly publishers and societies.
Publisher Policies
Multiple versions of a work are grouped to improve its ranking. In many research areas versions of a work may appear as preprints and conference papers before being published as a journal article. These preliminary versions of a work are often cited in addition to the authoritative journal version. The number of citations to a particular work is an important part of determining its rank in the Google Scholar search results. Grouping versions allows us to collect all citations to all versions of a work. In practice, this can significantly improve the position of an article in the search results.
The publisher's full text, if indexed, is the primary version. When multiple versions of a work are indexed, we select the full and authoritative text from the publisher as the primary version. We can do this only if we are able to identify, crawl, and process the full text of the publisher's version.
Publishers have control over access to their articles. We work with publishers to preserve their control over access to their content and only cache articles and papers that are not access-controlled. Publishers can help us by identifying which regions of their sites are access-controlled. For details, please click here.
Google users must be offered at least a complete abstract. This is a crucial component of our indexing program. For papers with access restrictions, a full author-written abstract will help users choose among the results which paper is the most likely to have the information they are looking for.
We will respond to complaints regarding copyright infringement. Our policy is to respond to all notices of alleged copyright infringement that comply with the Digital Millennium Copyright Act. For directions and more information, please click here.
Frequently Asked Questions
Common Questions
- I'm a publisher of scholarly works. How can I have my content included in Google and Google Scholar?
Your content is most welcome. If your works are already online, we may need nothing more than your permission for our crawlers to visit your site. As noted above, an abstract (at least) of each work must be available to non-subscribers who come from Google and Google Scholar. Please contact us to discuss the details.
- I publish scholarly textbooks and monographs. Can my content be included in Google Scholar?
For now, Google Scholar indexes only scholarly articles. For textbooks and monographs, we recommend Google Book Search.
- Can I see usage statistics for my content?
Since users click through to your website, your web server logs should have all the usage statistics.
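As a rough illustration of reading usage from server logs, the sketch below counts successful requests per article path whose referrer mentions Google Scholar. It assumes your server writes the common "combined" access log format and that such visits carry a referrer containing `scholar.google.`; both are assumptions about your setup, and the function name `count_scholar_referrals` is hypothetical.

```python
import re
from collections import Counter

# Combined Log Format:
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "[^"]*"$'
)

# Assumption: visits arriving from Google Scholar carry this text in the referrer.
SCHOLAR_HOST = "scholar.google."


def count_scholar_referrals(lines):
    """Count successful (2xx) requests per path referred by Google Scholar."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that are not in Combined Log Format
        if not match["status"].startswith("2"):
            continue  # ignore redirects and errors
        if SCHOLAR_HOST in match["referrer"]:
            counts[match["path"]] += 1
    return counts
```

Passing an open log file to `count_scholar_referrals` returns a `Counter` keyed by request path. Browsers do not always send a referrer, so counts obtained this way are best treated as a lower bound.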
- What do I do if I believe you're linking to a webpage that infringes my copyright?
It is our policy to respond to notices of alleged infringement that comply with the Digital Millennium Copyright Act. For directions and more information, please click here.
Technical Questions
- My articles are in PDF format. Can you still index my site?
Yes. We can index PDF articles as long as they're searchable. We also can index HTML, PostScript, compressed PostScript (ps.gz), and compressed PDF (pdf.gz).
- How can I tell if a PDF file has searchable text?
Open the file in Adobe Acrobat Reader. Click 'Find' (look for the binoculars icon), and confirm that you can search for and find several words on the page.
- Some of my articles are split into multiple files, one file per section. Can you work with these?
Alas, we can't. We can index only one file per article at the moment.
- Articles in PDF format can be large, and it's easy for me to extract the text from each article. Can you work with just the sequence of words for each document?
We strongly recommend preserving the full PDF layout information. We rely on a document's layout to extract metadata, citations, and other information that play a significant role in relevance ranking. If document size or crawl bandwidth is an issue, we can work with you to determine a suitable way to crawl your site. Please contact us.
- I see a 'cached' (or 'View as HTML') link to my access-controlled articles. I need to have this fixed right away!
Of course! Please email us with specific examples of where the links appear; we'll investigate and fix them as soon as possible. Such links are not intentional but may appear because of technical issues. For example, our methodical crawlers may accidentally discover a forgotten alternative interface to your content. Please tell us about all such interfaces, because crawlers can reach places you least expect.
If you believe another site is infringing your copyright, please see our directions on the DMCA process.
- Is there anything I can do to help rank my articles better?
Indeed you can. Our indexing algorithms automatically extract metadata, citations and other information from articles and use them for ranking purposes. Providing authoritative metadata about your articles can help facilitate this and can increase the likelihood of identifying all the citations to your articles. We strongly recommend this approach. Please contact us if you would like to work with us on this.
- All my articles are available to your crawlers, but not all of them seem to show up in Google Scholar. Can I do something to help improve coverage?
Based on our experience, here are some suggestions:
- Make sure all your articles can be reached from your home page by following simple HTML links. Building a browse interface for your site can help, and also can help users discover the full richness of your site.
- Avoid using session IDs, cookies, and other tracking parameters for our crawler. These provide useful information for users but hinder crawler operation, because they cause multiple URLs to be associated with each document.
- Provide us with a list of URLs for all your scholarly articles, along with article-level metadata for each. This facilitates both crawling and indexing.
For more details on any of these points, please don't hesitate to contact us.
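To illustrate the suggestion about session IDs and tracking parameters, here is a minimal sketch of how a site might compute one stable URL per document by dropping such parameters from the query string. The parameter names in `TRACKING_PARAMS` and the function name `canonical_url` are hypothetical; substitute whatever your site actually uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; replace with the ones your site really emits.
TRACKING_PARAMS = {
    "sessionid", "sid", "jsessionid",
    "utm_source", "utm_medium", "utm_campaign",
}


def canonical_url(url):
    """Return the URL with session and tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in TRACKING_PARAMS
    ]
    # Rebuild the URL with only the parameters that identify the document.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
```

With this sketch, two links that differ only in their session ID reduce to the same address, so each document is associated with a single URL. The same function could be used when generating the list of article URLs mentioned in the last suggestion.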