How to Search Using PubMed and Other Life Science Databases

COVID19 - CORONAVIRUS, 27 Jul 2020

Robert G. Smith, PhD | Orthomolecular Medicine News Service - TRANSCEND Media Service

25 Jul 2020 – The PubMed database is widely used to find health-related articles about a wide variety of topics. It references articles from hundreds of journals, both domestic USA and international. PubMed contains citations and information from life science journals and online books originally compiled into the MEDLINE database by the US National Library of Medicine, some originally published as far back as the nineteenth century. PubMed and several other life-science databases are run by the National Center for Biotechnology Information (NCBI).

Is your search a complete search?

Recently a “new” version of the PubMed search page has been designed that is supposed to be easier to use. [1] However, some of the features of the previous “legacy” PubMed search page appear at first sight to be missing. [2] For example, a pull-down menu that allowed the user to select other databases has been removed. In its place, access to the other databases is available by a simple click at the bottom of the PubMed page. In addition, in both the old and new versions of PubMed, the user can select specific combinations of search terms by clicking on “Advanced” just below the search box. This feature makes PubMed very powerful because it enables searches restricted to very specific fields, such as the first or last author, the journal, the title, or supplementary concepts. When you click on any paper to view its page, PubMed provides a list of “similar” articles, and also a list of citing articles (i.e. other papers that refer to the paper you’re looking at).
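The “Advanced” page essentially attaches a field tag to each term. The following is a minimal sketch of that idea, assuming PubMed's field tags [au], [1au], [lastau], [ta], and [ti]; the helper name build_fielded_query is hypothetical and is not part of any NCBI tool.

```python
# Hypothetical helper: compose a fielded PubMed query string.
# Assumed field tags: [au] author, [1au] first author, [lastau] last author,
# [ta] journal title abbreviation, [ti] title.
FIELD_TAGS = {
    "author": "au",
    "first_author": "1au",
    "last_author": "lastau",
    "journal": "ta",
    "title": "ti",
}


def build_fielded_query(**fields: str) -> str:
    """Attach a field tag to each term and combine the terms with AND."""
    parts = []
    for field_name, term in fields.items():
        tag = FIELD_TAGS[field_name]  # KeyError for an unsupported field name
        # Quote multi-word terms so they are treated as a single phrase.
        text = f'"{term}"' if " " in term else term
        parts.append(f"{text}[{tag}]")
    return " AND ".join(parts)


query = build_fielded_query(author="klenner fr", title="vitamin c")
print(query)
```

The resulting string can be pasted into the ordinary search box, which has the same effect as filling in the “Advanced” form by hand.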

Other widely used databases available at the bottom of the PubMed page include PubMed Central, [3] Europe PMC, [4] PubChem, GENE, Bookshelf, and others. Several of these databases, including PubMed Central, include “Advanced” buttons that allow the user to select which search terms to use.

Many articles are free access

The PubMed Central (PMC) database is a subset of the PubMed database that contains exclusively full-text articles that are available free for download. [3] A law passed by Congress in 2008 stipulates that articles published through NIH-funded research must be submitted to PMC for free public access (https://grants.nih.gov/grants/guide/notice-files/NOT-OD-08-033.html). The law allows journals 6 to 12 months of exclusive publication behind a paywall, after which the articles must be available free to the public.

Although many papers published before 2008 are not available in PMC, some are. Articles published before 2008 can be submitted to PMC by the copyright owner (i.e. the author or the journal) for free public access if their free publication is allowed by the journal.

Is there bias built into your search engine?

Although PubMed is designed to be easy for beginners to use, understanding how it performs a search on the words the user enters can be daunting. The results of a PubMed search can include those from PMC and online journals. But the default search terms are not always obvious and often may seem to ignore some controversial topics and journals. For many early articles, only the authors and title are available to PubMed; for others, the abstract is available but the full text is not.

In contrast, since PMC comprises full-text articles, it has access to the complete full text and References section. Therefore its default search terms normally include the References section. Articles located by PMC often include those that contain the search words only in the References (i.e. not in the main text of the article). Evidently for some articles available only in pdf form, the References section is scanned in as part of the full text of the article. When PMC is searched only for an author’s name, it does not merely return articles that contain the name in the author list, as PubMed does; in some cases it also returns articles that contain the name only in the References section. Evidently, if PMC finds papers in its full-text database with the named author in the author list, it searches exclusively using the author list. But if it finds no papers with the named author in the author list, it uses alternate search terms that include the References section. However, since PMC only includes a subset of PubMed papers, when searched with an author’s name that is in the author list of some of its papers, it may not return as many hits as PubMed.
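The fallback behavior inferred above can be written as a toy model. This is only a sketch of the conjecture in this article, not PMC's actual implementation; the Article class, the author_search function, and the records are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    title: str
    authors: list[str]
    references: list[str] = field(default_factory=list)


def author_search(articles: list[Article], name: str) -> list[Article]:
    """Toy model: match the author list first; fall back to References."""
    needle = name.lower()
    by_author = [a for a in articles
                 if any(needle in au.lower() for au in a.authors)]
    if by_author:
        # The name was found in an author list, so stop there.
        return by_author
    # Otherwise widen the search to the References section.
    return [a for a in articles
            if any(needle in ref.lower() for ref in a.references)]


# Invented records for illustration only.
library = [
    Article("Review of ascorbate therapy", ["Doe J"],
            ["Klenner FR. South Med Surg. 1949"]),
    Article("Unrelated study", ["Roe A"], ["Smith B. 2001"]),
]

hits = author_search(library, "klenner")
print([a.title for a in hits])
```

In this model the search for “klenner” returns a paper whose author list does not contain that name, which is the pattern described above.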

If you go to the “new” PubMed and scroll to the bottom, a variety of search paths are available with a click: [1]

Popular

  • PubMed
  • PubMed Central
  • Bookshelf
  • PubChem
  • Gene
  • BLAST
  • Nucleotide
  • Protein
  • GEO

Resources

  • Literature
  • Health
  • Genomes
  • Genes
  • Proteins
  • Chemicals

And there are several more general search categories, some of which are “powered by Bing”, i.e. they use bing.com to search in their sub-category:

NLM | NIH | HHS | USA.gov

The “Health” category includes by default all databases, but there is a pull-down menu that allows the user to select a particular one. They comprise different databases of articles and different default search terms.

In my experience, PMC is by default set up to return a wider search than PubMed, because its default search terms cover each article’s full text. The default PMC searches can include the References section of its articles, which for some searches produces more hits. This simply reflects the contents of the PMC database, which in many cases originate from a scan of the original pdf file.

The PMC database includes all the full-text articles resulting from NIH-funded research, so in PMC searches one often gets many articles where the search words are in the Reference section. This is not so obvious when one searches for a general term such as “low carbohydrate” since the hits all show these words somewhere. But when a PMC search is performed on an author’s name, very often the hits are papers that don’t list that author among their authors. The reason is that PMC contains full-text articles so the references are easily available. This is quite obvious when you do a search by author name but not so obvious when you search for a general phrase.

Comparison of databases

In comparing PubMed and PMC searches, using somewhat general phrases and comparing the total numbers from the different databases, one may get the impression that PMC does a “better” search because it returns more articles. But when the search includes the name of an author, the reason for the difference becomes more obvious. In many cases, the PubMed search will only return articles that list the entered name as an author; in some cases, just a few articles. But the identical search text on PMC returns many more articles. The reason is that PMC has the complete text and Reference section of every article, so it can return articles that contain references to the author name entered in the search. When evaluating the databases using a general phrase in the search, it is difficult to see this pattern, but when using an author name in the search, it is immediately obvious when that name is absent from the authors of the search results, because the results are listed by author!
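Hit counts from the two databases can also be compared programmatically. The sketch below assumes the NCBI E-utilities ESearch endpoint and that its JSON response carries the total under esearchresult.count; the function names are hypothetical, and live counts will differ from the 2020 figures quoted in this article.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: NCBI E-utilities ESearch.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(db: str, term: str) -> str:
    """Build an ESearch URL that requests a JSON response."""
    params = {"db": db, "term": term, "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


def parse_count(payload: str) -> int:
    """Extract the total hit count from an ESearch JSON response."""
    return int(json.loads(payload)["esearchresult"]["count"])


def hit_count(db: str, term: str) -> int:
    """Fetch the live hit count (requires network access)."""
    with urlopen(esearch_url(db, term), timeout=30) as response:
        return parse_count(response.read().decode("utf-8"))


# Printing the URLs needs no network; call hit_count(db, term) to query live.
for db in ("pubmed", "pmc"):
    print(esearch_url(db, "klenner vitamin c"))
```

Only the db parameter changes between the two requests, so any difference in the counts comes from what each database holds and how it interprets the same search text.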

Both types of searches are appropriate uses for the different databases, but only when one understands the basic data available for the searches can one get a handle on which database to use. PubMed, though its database includes a wider selection of articles than PMC, can’t always access the full text, so the References are evidently not included by default in the search. Europe PMC is similar to PubMed Central, i.e. it contains recently published, freely downloadable articles and can include Reference searches, but it also searches PubMed records that do not necessarily have full-text content. [4] Google and DuckDuckGo can find some of the pdfs but will also list articles that only have an abstract or just authors and a title, so they have a mix of search terms. [5,6] Google Scholar searches PubMed, PMC, Europe PMC, and also the entire internet, often producing results from a very wide assortment of articles, books, and online pages. [7]

Let’s try one

It helps to know what each database contains and what the default search terms are. A simple search shows the main point:

PubMed

Let’s do a search for vitamin C specialist Frederick Robert Klenner, MD. Search for “klenner vitamin c” on PubMed, and one gets 4 of his original articles. Evidently the name “klenner” is treated as an author:

  1. The treatment of poliomyelitis and other virus diseases with vitamin C.
    KLENNER FR. South Med Surg. 1949 Jul;111(7):209-14. PMID: 18147027 No abstract available.
  2. Virus pneumonia and its treatment with vitamin C.
    KLENNER FR. South Med Surg. 1948 Feb;110(2):36-8. PMID: 18900646 No abstract available.
  3. Massive doses of vitamin C and the virus diseases.
    KLENNER FR. South Med Surg. 1951 Apr;113(4):101-7. PMID: 14855098 No abstract available.
  4. The vitamin and massage treatment for acute poliomyelitis.
    KLENNER FR. South Med Surg. 1952 Aug;114(8):194-7. PMID: 12984224

PubMed Central

Then click on “PubMed Central” at the bottom of the PubMed page, and search for “klenner vitamin c”, and one gets 9 fairly recent articles, none of which are Klenner’s original reports, but which mention Klenner or have Klenner papers in their reference section:

  1. Hydrocortisone, Ascorbic Acid and Thiamine (HAT Therapy) for the Treatment of Sepsis. Focus on Ascorbic Acid
    Paul E. Marik
    Nutrients. 2018 Nov; 10(11): 1762. Published online 2018 Nov 14. doi: 10.3390/nu10111762
    PMCID: PMC6265973
  2. High dose concentration administration of ascorbic acid inhibits tumor growth in BALB/C mice implanted with sarcoma 180 cancer cells via the restriction of angiogenesis
    Chang-Hwan Yeom, Gunsup Lee, Jin-Hee Park, Jaelim Yu, Seyeon Park, Sang-Yeop Yi, Hye Ree Lee, Young Seon Hong, Joosung Yang, Sukchan Lee
    J Transl Med. 2009; 7: 70. Published online 2009 Aug 11. doi: 10.1186/1479-5876-7-70
    PMCID: PMC2732919
  3. Changes of Terminal Cancer Patients’ Health-related Quality of Life after High Dose Vitamin C Administration
    Chang Hwan Yeom, Gyou Chul Jung, Keun Jeong Song
    J Korean Med Sci. 2007 Feb; 22(1): 7-11. Published online 2007 Feb 28. doi: 10.3346/jkms.2007.22.1.7
    PMCID: PMC2693571
  4. Pharmacogenomic Characterization and Isobologram Analysis of the Combination of Ascorbic Acid and Curcumin: Two Main Metabolites of Curcuma longa in Cancer Cells
    Edna Ooko, Onat Kadioglu, Henry J. Greten, Thomas Efferth
    Front Pharmacol. 2017; 8: 38. Published online 2017 Feb 2. doi: 10.3389/fphar.2017.00038
    PMCID: PMC5288649
  5. Ascorbic acid inhibits replication and infectivity of avian RNA tumor virus.
    M J Bissell, C Hatie, D A Farson, R I Schwarz, W J Soo
    Proc Natl Acad Sci U S A. 1980 May; 77(5): 2711-2715. doi: 10.1073/pnas.77.5.2711
    Correction in: Proc Natl Acad Sci U S A. 1981 Sep; 78(9): 5917.
    PMCID: PMC349473
  6. Patterns of vitamin C intake from food and supplements: survey of an adult population in Alameda County, California.
    L R Shapiro, S Samuels, L Breslow, T Camacho
    Am J Public Health. 1983 Jul; 73(7): 773-778. doi: 10.2105/ajph.73.7.773
    PMCID: PMC1650902
  7. Suppression of human immunodeficiency virus replication by ascorbate in chronically and acutely infected cells.
    S Harakeh, R J Jariwalla, L Pauling
    Proc Natl Acad Sci U S A. 1990 Sep; 87(18): 7245-7249. doi: 10.1073/pnas.87.18.7245
    PMCID: PMC54720
  8. Inhibition of AcpA Phosphatase Activity with Ascorbate Attenuates Francisella tularensis Intramacrophage Survival
    Steven McRae, Fernando A. Pagliai, Nrusingh P. Mohapatra, Alejandro Gener, Asma Sayed Abdelgeliel Mahmou, John S. Gunn, Graciela L. Lorca, Claudio F. Gonzalez
    J Biol Chem. 2010 Feb 19; 285(8): 5171-5177. Published online 2009 Dec 22. doi: 10.1074/jbc.M109.039511
    PMCID: PMC2820744
  9. Ascorbate ameliorates Echis coloratus venom-induced oxidative stress in human fibroblasts
    Yazeed A. Al-Sheikh, Hazem K. Ghneim, Feda S. Aljaser, Mourad A.M. Aboul-Soud
    Exp Ther Med. 2017 Jul; 14(1): 703-713. Published online 2017 May 30. doi: 10.3892/etm.2017.4522
    PMCID: PMC5488744

Then go back to PubMed, and click on “Health” at the bottom (which includes “All Databases” and “Search NCBI”), and search for “klenner vitamin c”. This shows a list of the 4 PubMed articles, the 9 PMC articles, and 3 NLM catalog listings (all the same one). The Europe PMC search for “klenner vitamin c” produces 14 articles, many of which are the same as those in the PMC search.

Click instead on “PubChem” and search for “klenner vitamin c”, and one gets 6 hits, 4 of which are FR Klenner, and 2 are other Klenners. You can readily do a similar search on google.com (9 results), or on duckduckgo.com (more than 30 results).

Google Scholar

With the same search phrase “klenner vitamin c”, Google Scholar gives 923 hits, with a mix of articles, books, and online pages. [7] A more specific search phrase “klenner 1949” gives citations to FR Klenner’s 2 published articles of 1949, taken from Europe PMC. Google Scholar is very powerful but needs specific phrases in order to limit the results to a manageable number. It uses citation counts (i.e. how many other articles refer to an article in their Reference section) to assign a weight to the articles it lists which affects the order in which they are displayed. However, with its tremendous search base, it and other search engines available online have tended to make obsolete many other databases more limited in scope. [8]

Another author example

As another example, a search for “Pauling L” gets 229 hits on PubMed, whereas the same search on PMC gets only 123 hits. Note that author initials are important to select a specific author. The initials are included in the search phrase after the author’s last name. A search for “Pauling” on PubMed gets 1634 hits, but on PMC it gets 5312 hits! The reason for the extra hits on PMC is evidently that a search for an author’s name without initials uses the alternate search terms that include the main text and the References section. There are many variations of this effect, so depending on exactly what you are looking for, it may be helpful to experiment with different search phrases. With “Pauling” Google Scholar gives ~159,000 hits, and with “Linus Pauling”, ~27,600 hits.

Exact phrases

To narrow a search to those articles that include an exact phrase, instead of those that include some or all of the words in the phrase, you can enter the phrase inside double quotes. For example, [“klenner vitamin” c] would search for the exact phrase “klenner vitamin” and also the word “c”. This returns no articles in PMC or Europe PMC, and PubMed finds no articles but then defaults to removing the double quotes to find the same 4 articles as it did without the quotes.
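The difference between the two kinds of match can be illustrated locally. This is a simplified sketch, not the matching logic of PubMed or PMC; the function names and the sample sentence are invented for illustration.

```python
import re


def words(text: str) -> list[str]:
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_all_words(text: str, query: str) -> bool:
    """True when every query word occurs somewhere in the text."""
    present = set(words(text))
    return all(w in present for w in words(query))


def matches_exact_phrase(text: str, phrase: str) -> bool:
    """True when the phrase words occur consecutively and in order."""
    haystack, needle = words(text), words(phrase)
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


sample = "Klenner treated virus diseases with vitamin C."
print(matches_all_words(sample, "klenner vitamin c"))
print(matches_exact_phrase(sample, "klenner vitamin"))
```

The sample sentence contains all three words, so the unquoted search matches it, but “klenner” and “vitamin” are never adjacent, so the quoted phrase does not.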

General or specific topics?

It’s very easy to use online databases: you only need to enter a search phrase. But how the database server responds can vary widely, depending on the data searched by the database, the search terms it uses for the search, and the “display options.” You can set the order of results according to the “best match” or according to the date. Both orderings return the same set of results, but the results shown on the first several pages will likely differ, because the articles that “best match” your search terms may not be the most recent.
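A small sketch shows how the display order changes what appears on the first page without changing the overall result set. The records and relevance scores below are invented, and real “best match” ranking is more elaborate than a single number.

```python
from datetime import date

# Invented records: (title, publication date, relevance score).
results = [
    ("Classic report", date(1949, 7, 1), 0.95),
    ("Recent review", date(2018, 11, 14), 0.60),
    ("Mid-era study", date(1990, 9, 1), 0.80),
]

# Same records, two display orders.
by_best_match = sorted(results, key=lambda r: r[2], reverse=True)
by_date = sorted(results, key=lambda r: r[1], reverse=True)

page_size = 2
first_page_best = [r[0] for r in by_best_match[:page_size]]
first_page_date = [r[0] for r in by_date[:page_size]]

print(first_page_best)
print(first_page_date)
```

With a page size of 2, the classic report leads the “best match” page but is absent from the first page sorted by date.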

An example

Let’s search for “low carbohydrate” in PubMed. That yields some 178,000 articles, and in PMC it gives ~370,000. This is likely for the same reason mentioned above: the search terms for PMC include the entire full text and References section. When the search is limited to a seemingly more specific phrase, e.g. “low carbohydrate diet”, PubMed gives a much more restricted ~9,700 articles, and PMC gives ~71,000. In all these cases, the caveat is that the search phrase was not entered inside double quotes. The reason so many articles were given from the above searches is that they include any article that contains the words, “low” and “carbohydrate” anywhere in the article (and in the case of PMC, also including the title of any reference).

When the search phrase is specified exactly, i.e. inside double quotes, [“low carbohydrate”], PubMed returns ~2,900 articles, and PMC returns ~8,000 articles. Again it appears that PMC returns more articles because its search terms include the full text and references. However, one benefit of the PubMed article listing is that it shows the sentence in which the search phrase occurs, so the user can determine if the match is appropriate. This feature is also provided by Europe PMC and Google Scholar. It is a very powerful feature when a broad search term is specified.

When the search phrase is specified more precisely inside double quotes [“low carbohydrate diet”], PubMed returns ~1,200 articles, and PMC returns ~3,600. Although PubMed looks for this exact search phrase in the title, abstract, main text, and keywords, apparently PMC finds more articles because it also includes in its search the titles from its References. Google Scholar finds ~27,000 articles, including those listed in PubMed and PMC searches.
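The approximate counts quoted in this example can be placed side by side to see how much wider the PMC search is for each form of the query. The figures are the rounded July 2020 values from the text above; only the ratio calculation is new.

```python
# Approximate hit counts reported in this article: (PubMed, PMC).
counts = {
    "low carbohydrate": (178_000, 370_000),
    "low carbohydrate diet": (9_700, 71_000),
    '"low carbohydrate"': (2_900, 8_000),
    '"low carbohydrate diet"': (1_200, 3_600),
}

# Ratio of PMC hits to PubMed hits for each query, to one decimal place.
ratios = {query: round(pmc / pubmed, 1)
          for query, (pubmed, pmc) in counts.items()}

for query, ratio in ratios.items():
    print(f"{query:<26} PMC/PubMed = {ratio}")
```

PMC returns at least twice as many hits as PubMed for every form of the query, with the largest gap for the unquoted three-word phrase.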

Intelligence or defaults?

The bottom line here is that database search engines do not have what one would call intelligence — they have some default settings, some equivalent phrases (e.g. “vitamin C” = “ascorbic acid”), many options for search terms (in “Advanced”), and several display options. They may also have default exclusions that prevent articles on controversial topics or authors from being listed. If you are searching for a general topic, you may get a huge number of matches which are mostly not appropriate to your intentions. If you want to search for a more specific topic, it helps to include your search terms inside double quotes so the search phrase is specified exactly.
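The idea of equivalent phrases can be sketched as a simple lookup that widens a term into an OR-group. The table and the expand_query function are hypothetical; each real service maintains its own mappings and applies them in its own way.

```python
# Hypothetical synonym table; real services maintain their own mappings.
EQUIVALENTS = {
    "vitamin c": ["vitamin c", "ascorbic acid"],
}


def expand_query(term: str) -> str:
    """Expand a term into an OR-group of its known equivalents."""
    variants = EQUIVALENTS.get(term.lower(), [term])
    quoted = [f'"{v}"' for v in variants]
    if len(quoted) == 1:
        return quoted[0]
    return "(" + " OR ".join(quoted) + ")"


print(expand_query("Vitamin C"))
print(expand_query("klenner"))
```

A term without an entry in the table passes through unchanged, apart from the added quotes, which is why two apparently similar searches can behave quite differently.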

Conclusion

The takeaway point is that each database comprises different categories of articles, and each database uses different default search terms and search methods. Certainly PubMed doesn’t contain and can’t search through all the articles from all medical fields. It doesn’t include certain journals (e.g. JOM). [9] To get the most from an online search, it helps to know what type of articles the databases contain and how the searches are performed.

PubMed and PMC, if used wisely, are very powerful, but should not be used as an “encyclopedia.” If you use them informally to search general topics, you may miss important articles, both recent and classic. The searches these services perform are based on matches between the search phrase and a selection of the content of each article, defined by the search terms applied by the database, all of which vary depending on the database. In contrast, Google Scholar uses different search criteria, and searches PubMed, PMC, and scholarly articles and books from the entire internet. While extremely powerful, its greater search space and greater number of listed articles emphasize the problem of determining relevance.

References:

1. PubMed: https://pubmed.ncbi.nlm.nih.gov

2. Legacy PubMed (available until 2020-09-30): https://pmlegacy.ncbi.nlm.nih.gov

3. PubMed Central: https://www.ncbi.nlm.nih.gov/pmc

4. Europe PMC: https://europepmc.org

5. Google: https://www.google.com

6. DuckDuckGo: https://duckduckgo.com

7. Google Scholar: https://scholar.google.com

8. Saul AW, Hickey S. (2007) Medical Obsolescence. http://www.doctoryourself.com/obsolescence.html

9. Journal of Orthomolecular Medicine. http://orthomolecular.org/library/jom

______________________________________

Dr. Robert G. Smith is Associate Research Professor of Neuroscience at the University of Pennsylvania Perelman School of Medicine and is Associate Editor of the Orthomolecular Medicine News Service. He is the author of The Vitamin Cure for Eye Diseases and coauthor of The Vitamin Cure for Arthritis.

The peer-reviewed Orthomolecular Medicine News Service is a non-profit and non-commercial informational resource. Orthomolecular medicine uses safe, effective nutritional therapy to fight illness. For more information: http://www.orthomolecular.org. Comments and media contact: drsaul@doctoryourself.com

Go to Original – orthomolecular.org




DISCLAIMER: The statements, views and opinions expressed in pieces republished here are solely those of the authors and do not necessarily represent those of TMS. In accordance with title 17 U.S.C. section 107, this material is distributed without profit to those who have expressed a prior interest in receiving the included information for research and educational purposes. TMS has no affiliation whatsoever with the originator of this article nor is TMS endorsed or sponsored by the originator. “GO TO ORIGINAL” links are provided as a convenience to our readers and allow for verification of authenticity. However, as originating pages are often updated by their originating host sites, the versions posted may not match the versions our readers view when clicking the “GO TO ORIGINAL” links. This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available in our efforts to advance understanding of environmental, political, human rights, economic, democracy, scientific, and social justice issues, etc. We believe this constitutes a ‘fair use’ of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, the material on this site is distributed without profit to those who have expressed a prior interest in receiving the included information for research and educational purposes. For more information go to: http://www.law.cornell.edu/uscode/17/107.shtml. If you wish to use copyrighted material from this site for purposes of your own that go beyond ‘fair use’, you must obtain permission from the copyright owner.
