Search Engines

Google

To view cached copies of Google results, see the Web Archives section on this page.

Google Dorks

- Quotation Marks

Use quotes around the name of your target.

When your quoted search returns too many results, you should add more quoted terms to your search.

- Site Operator

To locate every page that is part of a specific domain.

Some of the pages on a website that the author may consider "private" may actually be public if he or she ever linked to them from a public page. Once Google has indexed the page, we can view the content using the "site" operator.

Examples:

site:forbes.com "Michael Bazzell"

site:amazon.com "{target name}"

- File Type Operator

The "filetype" operator restricts results to documents with a given file extension. Google allows this operator to be shortened to "ext". This can be combined with the "site" operator. Examples:

"Cisco" filetype:ppt

"Cisco Confidential" filetype:ppt

filetype:doc "resume" "target name"

site:irongeek.com filetype:pdf

site:irongeek.com filetype:ppt

site:irongeek.com filetype:pptx

The following extensions can be indexed:

7Z: Compressed File

BMP: Bitmap Image

DOC: Microsoft Word

DOCX: Microsoft Word

DWF: Autodesk

GIF: Animated Image

HTM: Web Page

HTML: Web Page

JPG: Image

JPEG: Image

KML: Google Earth

KMZ: Google Earth

ODP: OpenOffice Presentation

ODS: OpenOffice Spreadsheet

ODT: OpenOffice Text

PDF: Adobe Acrobat

PNG: Image

PPT: Microsoft PowerPoint

PPTX: Microsoft PowerPoint

RAR: Compressed File

RTF: Rich Text Format

TXT: Text File

XLS: Microsoft Excel

XLSX: Microsoft Excel

ZIP: Compressed File
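
The quoted-phrase, "site", and file type operators described above combine mechanically, so a small helper can assemble them. The following is a minimal Python sketch rather than any official tooling: the names `build_query` and `INDEXED_EXTENSIONS` are hypothetical, and the extension check only guards against typos based on the list above.

```python
# Extensions taken from the list above (lowercase); used only to catch typos.
INDEXED_EXTENSIONS = {
    "7z", "bmp", "doc", "docx", "dwf", "gif", "htm", "html", "jpg", "jpeg",
    "kml", "kmz", "odp", "ods", "odt", "pdf", "png", "ppt", "pptx", "rar",
    "rtf", "txt", "xls", "xlsx", "zip",
}


def build_query(phrases, site=None, filetype=None):
    """Quote each phrase and append optional site and filetype operators."""
    parts = [f'"{phrase}"' for phrase in phrases]
    if site:
        parts.append(f"site:{site}")
    if filetype:
        ext = filetype.lower()
        if ext not in INDEXED_EXTENSIONS:
            raise ValueError(f"unexpected extension: {filetype}")
        parts.append(f"filetype:{ext}")
    return " ".join(parts)


print(build_query(["Cisco Confidential"], filetype="ppt"))
print(build_query(["resume", "target name"], site="irongeek.com", filetype="pdf"))
```

Adding more quoted phrases to the `phrases` list narrows the results in the same way as adding quoted terms by hand.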

- Hyphen (-)

To exclude some content from appearing within results. Examples:

"Michael Bazzell" -police

"Michael Bazzell" -police -FBI -osint -books -"open source" -"mr. robot"

- InURL Operator

Operators that will focus only on the data within the URL.

We can use this technique to find File Transfer Protocol (FTP) servers that allow anonymous connections.

To identify FTP servers that possess PDF files that contain the term OSINT within the file:

inurl:ftp -inurl:(http|https) filetype:pdf "osint"

  • inurl:ftp --> Instructs Google to only display addresses that contain "ftp" in the URL.

  • -inurl:(http|https) --> Instructs Google to ignore any addresses that contain either http or https in the URL.

Other example: To display only blog posts from inteltechniques.com that exist within a folder titled "blog" (WordPress):

inurl:/blog/ site:inteltechniques.com

inurl:index.php?id=

inurl:main.php Welcome to phpMyAdmin

inurl:axis

- InTitle Operator

Present web pages that have specific content within the title of the page.

intitle:"osint video training"

intitle:pwd

intitle:"site administration:please log in"

intitle:"curriculum vitae" filetype:doc

intitle:"Retina Report" "Confidential Information"

intitle:"Nessus Report" "Confidential Information"

intitle:Axis inurl:"/admin/admin.shtml"

We can also search for online folders:

intitle:index.of OSINT

- OR Operator

You may have a target that has a unique last name that is often misspelled. The "OR" (uppercase) operator returns pages that have just A, just B, or both A and B. Example:

"Michael Bazzell" OR "Mike Bazzell"

- Asterisk Operator (*)

Wildcard. For example, "osint * training" tells Google to find pages containing a phrase that starts with "osint" followed by one or more words, followed by "training".

- Range Operator (..)

To search between two identifiers.

For example, to identify websites that contain information about Bonnie Woodward, a missing person, and between 1 and 999 comments within the page:

"bonnie woodward" "1..999 comments"

- Related Operator

It takes a domain and attempts to provide online content related to that address.

related:inteltechniques.com

- Google Search Tools

There is a text bar at the top of every Google results page; the last option on this bar is the "Tools" link. This provides additional filters to help you focus only on the desired results.

Google Programmable Search Engines

We may consider creating our own custom search engines:

https://programmablesearchengine.google.com/

- Social Networks Custom Search Engine

First create a new custom search engine by clicking "New search engine" in the left menu.

Insert the social media websites to be searched:

  • Facebook.com

  • LinkedIn.com

  • Twitter.com

  • YouTube.com

  • Instagram.com

  • Tumblr.com

After you have added these websites, provided a name, and created your engine, navigate to the control panel option in order to view the configuration of this custom search engine. On the left menu, select "Edit search engine", select your engine and click "Search features", then select a new option at the top of the page labeled "Refinements". Click the "add" button to add a new refinement for each of the websites and add the following:

  • Facebook

  • LinkedIn

  • Twitter

  • YouTube

  • Instagram

  • Tumblr

When each refinement is created, you will have two options. "Give priority to the sites with this label" will place emphasis on matching rules, but will also reach outside of the rule if minimal results are present. "Search only the sites with this label" will force Google to remain within the search request and not disclose other sites. I recommend using the second option for each refinement.

Now that you have the refinements made, you must assign each of them to a website. Back on the "Setup" menu option, select each social network website to open the configuration menu. Select the drop-down menu titled "Label" and select the appropriate refinement. Repeat this process for each website and save your progress. Navigate back to "Setup" in the left menu and select the Public URL link to see the exact address of your new engine. You can now search any term or terms that you want and receive results for only the social networks that you specified.

- Documents Custom Search Engine

Create a new custom search engine and title it "Documents". Add only "google.com" as the website to be searched. Save your engine, click "Edit search engine", and then click "Setup". In the "Sites to search" portion, enable the "Search the entire web" toggle. Delete google.com from the sites to be searched. The engine will now essentially do the same thing as Google's home page.

Now add refinements to filter your search results. Add a new refinement titled "PDF"; change the default setting to "Give priority to the sites with this label"; and enter the following in the "Optional word(s)" field:

  • ext:pdf

This will create a refinement that will allow you to isolate only PDF documents within any search that you conduct. Save this setting and create a new refinement. Title it DOC; change the default search setting; and place the following in the "Optional word(s)" field:

  • ext:doc OR ext:docx

Repeat this process for each of the following document types with the following language for each type:

  • ext:xls OR ext:xlsx OR ext:csv

  • ext:ppt OR ext:pptx

  • ext:txt OR ext:rtf

  • ext:wpd

  • ext:odt OR ext:ods OR ext:odp

  • ext:zip OR ext:rar OR ext:7z
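
Typing each "Optional word(s)" string by hand invites typos, so the strings can be generated from a table of extensions. This is a small Python sketch under stated assumptions: "PDF" and "DOC" are the refinement names given in the text, while the remaining labels and the names `REFINEMENTS` and `optional_words` are hypothetical choices for illustration.

```python
# Refinement label -> extensions it should isolate.
# "PDF" and "DOC" come from the text; the other labels are hypothetical.
REFINEMENTS = {
    "PDF": ["pdf"],
    "DOC": ["doc", "docx"],
    "XLS": ["xls", "xlsx", "csv"],
    "PPT": ["ppt", "pptx"],
    "TXT": ["txt", "rtf"],
    "WPD": ["wpd"],
    "ODT": ["odt", "ods", "odp"],
    "ZIP": ["zip", "rar", "7z"],
}


def optional_words(extensions):
    """Join extensions into the string expected by the "Optional word(s)" field."""
    return " OR ".join(f"ext:{ext}" for ext in extensions)


for label, extensions in REFINEMENTS.items():
    print(f"{label}: {optional_words(extensions)}")
```

Each printed line can be pasted into the matching refinement in the control panel.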

- Other Custom Search Engine Examples

We could make an engine that isolated images with extensions such as jpg, jpeg, png, bmp, gif, etc. We could also replicate all of this into a custom engine that only searched a specific website. If you were monitoring threats against your company, you could isolate only these files that appear on one or more of your company's domains.

Google Alerts

google.com/alerts

When you have exhausted the search options on search engines looking for a target, you will want to know if new content is posted.

Utilizing Google Alerts will put Google to work on locating new information.

Additional Google Engines

The following engines will likely give you results that you will not find during a standard Google or Bing search.

- Google Blogs (google.com)

Load the "Blogs" option under the "News" menu within the "Tools" option on any Google News results page. Alternatively, you can navigate to the following address, replacing TEST with your search terms.

google.com/search?q=TEST&tbm=nws&tbs=nrtb
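
Replacing TEST by hand breaks when the search terms contain spaces or quotes, so the substitution is better done with URL encoding. A minimal Python sketch follows; the function name `google_blogs_url` is hypothetical and the `https://www.` prefix is an assumption added to the address shown above.

```python
from urllib.parse import quote_plus


def google_blogs_url(terms):
    """Insert URL-encoded search terms into the blog-search address."""
    return f"https://www.google.com/search?q={quote_plus(terms)}&tbm=nws&tbs=nrtb"


print(google_blogs_url("osint training"))
print(google_blogs_url('"Michael Bazzell"'))
```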

- Google Patents (google.com/?tbm=pts)

The best patent search option on the internet. It allows you to search the entire patent database within any field of a patent. Advanced patent search at google.com/advanced_patent_search.

- Google Scholar (scholar.google.com)

Indexes the full text of scholarly literature across an array of publishing formats.

We can locate many court records through this free website that will cost money to obtain from private services.

- Keyword Tool (keywordtool.io)

Google quickly offers suggestions as you type in your search.

Google only provides the five most popular entries. Keyword Tool provides the ten most popular entries. Additionally, you can choose different countries to isolate popular terms. You can also see results from similar searches that Google does not display.

Bing

The site operator and the use of quotes both work with Bing exactly as they do with Google.

To view cached copies of results, see the Web Archives section on this page.

Bing offers an option that will list every website to which a target website links.

This operator creates a result that includes every website to which I have a link, located on any of the pages within my website. Example:

LinkFromDomain:inteltechniques.com

Bing Contains

The "contains" operator allows you to expand the parameters of the file type search.

Example:

contains:ppt site:cisco.com

These include PowerPoint files that are linked from pages on the domain of cisco.com, even if they are stored on other domains. This could include a page on cisco.com that links to a PowerPoint file on hp.com.
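
The two Bing-specific operators described in this section can be wrapped in small helpers so they are spelled consistently. This is only an illustrative Python sketch; the function names are hypothetical and nothing here contacts Bing.

```python
def link_from_domain(domain):
    """Query listing every website that the given domain links to."""
    return f"LinkFromDomain:{domain}"


def contains(extension, site):
    """Query for pages on `site` that link to files with the given extension."""
    return f"contains:{extension} site:{site}"


print(link_from_domain("inteltechniques.com"))
print(contains("ppt", "cisco.com"))
```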

International Search Engines

In Russia, Yandex is the search engine of choice. In China, most people use Baidu.

- Yandex (yandex.com)

It has similar operators to Google.

Specific operators:

  • ampersand (&) —> indicate that you want to search for multiple terms.

  • && —> pages that have both those words within the same page, but not necessarily the same sentence.

  • (+)—> Michael + Bazzell would mandate that the page has the word Bazzell, but not necessarily Michael.

  • (|) —> same as “OR”

  • (~) —> To exclude a word use this instead of (-)

  • (!) —> "!Carina!Abad!Abad" identify any results that included those three words regardless of spacing or punctuation.

  • Date specific searches:

date:20111201..20111231 OSINT —> Websites mentioning OSINT between December 1-31, 2011

date:2011* OSINT --> Websites mentioning OSINT in the year 2011

date:201112* OSINT —> Websites mentioning OSINT in December of 2011

date:>20111201 OSINT --> Websites mentioning OSINT after December 1, 2011
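
The date forms listed above follow a fixed YYYYMMDD pattern, so they can be generated from real dates instead of typed by hand. The Python sketch below is illustrative only: the function names are hypothetical, and it formats the operator strings without checking whether Yandex currently honors them.

```python
from datetime import date


def yandex_date_range(start, end, terms):
    """Pages mentioning `terms` between two dates (inclusive)."""
    return f"date:{start:%Y%m%d}..{end:%Y%m%d} {terms}"


def yandex_year(year, terms):
    """Pages mentioning `terms` within a given year."""
    return f"date:{year:04d}* {terms}"


def yandex_month(year, month, terms):
    """Pages mentioning `terms` within a given month."""
    return f"date:{year:04d}{month:02d}* {terms}"


def yandex_after(day, terms):
    """Pages mentioning `terms` after a given date."""
    return f"date:>{day:%Y%m%d} {terms}"


print(yandex_date_range(date(2011, 12, 1), date(2011, 12, 31), "OSINT"))
print(yandex_month(2011, 12, "OSINT"))
print(yandex_after(date(2011, 12, 1), "OSINT"))
```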

- Baidu (baidu.com)

http://www.baidu.com/s?wd=osint

I Search From

isearchfrom.com

This site simplifies the process of searching a version of Google specified for another country.

Web Archives

Occasionally, you will try to access a site and the information you are looking for is no longer there. Maybe something was removed or amended, or maybe the whole page was permanently removed. Web archives, or "caches", may still hold a copy.

Historical copies of websites are one of the most vital resources when conducting any type of online research.

We should use a manual approach. Engines such as Bing and Yandex generate a unique code when a cache is displayed. This action prevents most automated search tools from collecting archived information.

Any time that I find a website, profile, or blog of interest, I immediately look at caches hoping to identify changes in content. These minor alterations can be very important.

- Google Cache (google.com)

When conducting a Google search, you will see a green down arrow that will present a menu when clicked. This menu will include a link titled "Cached". Clicking it will load a version of the page of interest from a previous date.

To navigate directly to the cached page, use the "cache" operator:

cache:www.phonelosers.org/snowplowshow

- Bing Cache (bing.com)

Similar to Google.

- Yandex Cache (yandex.com)

Yandex presents a green drop-down menu directly under the title of the search result; this displays their cache menu option.

The biggest strength of the Yandex cache is the lack of updates. An older cache can be very helpful in an investigation.

- Baidu Cache (baidu.com)

Look for a word in Chinese directly to the right of the link. Clicking this link will open a new tab with the cache result.

- The Wayback Machine (archive.org/web/web.php)

This will provide a much more extensive list of options for viewing a website historically.

To conduct a search via a direct URL:

https://web.archive.org/web/*/Michael Bazzell

The results identify over twenty websites that include these terms. Within those sites are dozens of archived copies of each.
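
Both the Wayback Machine address and the Google "cache" operator follow simple patterns, so they can be built for any target. The following Python sketch is illustrative: the function names are hypothetical, and encoding spaces as `%20` is an assumption about how the address should be submitted.

```python
from urllib.parse import quote


def wayback_url(target):
    """Wayback Machine address listing archived captures for a URL or search terms."""
    return "https://web.archive.org/web/*/" + quote(target, safe="/:?=&")


def google_cache_query(url):
    """Google query that jumps directly to the cached copy of a page."""
    for prefix in ("https://", "http://"):
        if url.startswith(prefix):
            url = url[len(prefix):]  # the operator is used without the scheme
            break
    return f"cache:{url}"


print(wayback_url("inteltechniques.com"))
print(wayback_url("Michael Bazzell"))
print(google_cache_query("https://www.phonelosers.org/snowplowshow"))
```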

Non-English Results

- 2Lingual (2lingual.com)

Conduct one search across two country sites on Google. This will display a plain search box and choices of two countries. Foreign results will be automatically translated to English.

- Google Translator (translate.google.com)

We can translate an entire website.

Instead of copying individual text to the search box, type or paste in the exact URL (address) of the website you want translated.

This will also work on social network sites.

- Bing Translator (bing.com/translator)

You can also type or paste the address of an entire foreign website to conduct a translation.

- DeepL (deepl.com/translator)

The most accurate translator. Appears and functions identically to the previous options.

- PROMT Online Translator (online-translator.com)

It allows translation of entire websites similar to Google and Bing.

- Google Input Tools (google.com/inputtools/try)

It allows you to type in any language you choose.

This technique is extremely important when you have located a username in a foreign language. As with all computer-generated translation services, the results are never absolutely accurate.

Newspaper resources

- Google News Archive (news.google.com)

Google's News Archive is continually adding content from both online archives and digitized content from their News Archive Partner Program. Sources include newspapers from large cities, small towns, and anything in between. It will allow for a detailed search of a target's name with filters including dates, language, and specific publication.

To display the menu, click on the down arrow to the right of the search box. This can quickly identify some history of a target such as previous living locations, family members through obituaries, and associates through events, awards, or organizations.

- Google Newspaper Archive (news.google.com/newspapers)

The previous option focused solely on digital content. Google's Newspaper Archive possesses content from printed newspapers.

- Newspaper Archive (newspaperarchive.com)

This paid service provides the world's largest collection of newspaper archives.

Example:

site:newspaperarchive.com "This archive is hosted by" "create free account"

This isolates only the newspaper collections that are available for free and without a credit card.

site:newspaperarchive.com "This archive is hosted by" "cedar rapids gazette"

- Old Fulton (fultonhistory.com/Fulton.html)

- Library of Congress US News Directory (chroniclingamerica.loc.gov)

- Library of Congress US News Directory (chroniclingamerica.loc.gov/search/titles)

- Small Town Newspapers (stparchive.com)

Specialized Search Engines

- Searx (searx.be)

Conducting a search will provide results from the main search engines, but will remove duplicate entries.

Repeat this redundancy-reducing option by checking results on Images, News, and Videos sections. Next to each result on any search page is a "cached" link.

Finally, a "proxied" option next to each result will connect you to the target website through a proxy service provided by Searx.

The "Links" section to the right of all search pages displays options to download a csv, json, or rss file of the results. The csv option is a simple spreadsheet that possesses all of the search results with descriptions and direct links. This is helpful when you have many searches to conduct in a short amount of time, and do not have the ability to analyze the results until later.
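
A downloaded csv file can be reviewed later with a few lines of Python. The sketch below is only an illustration: the column names `title`, `url`, and `content` are assumptions about the export layout, the sample rows are invented, and `load_results` is a hypothetical helper name. Check the header row of a real export before relying on it.

```python
import csv
import io

# Invented sample standing in for a downloaded export (column names assumed).
SAMPLE = (
    "title,url,content\n"
    "Example A,https://example.com/a,First description\n"
    "Example B,https://example.org/b,Second description\n"
    "Example A again,https://example.com/a,First description repeated\n"
)


def load_results(handle):
    """Read result rows, keeping only the first row seen for each URL."""
    seen = set()
    rows = []
    for row in csv.DictReader(handle):
        if row["url"] in seen:
            continue
        seen.add(row["url"])
        rows.append(row)
    return rows


results = load_results(io.StringIO(SAMPLE))
for row in results:
    print(row["url"], "-", row["title"])
```

For a real file, pass `open("results.csv", newline="", encoding="utf-8")` in place of the `io.StringIO` sample; the file name here is a placeholder.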

- Exalead

This engine is good in two areas.

Works well in finding documents that include the target mentioned within the document.

Voxalead, an Exalead search engine, searches within audio and video files for specific words.

- DuckDuckGo (duckduckgo.com)

Does not track anything from users. Useful with sensitive investigations.

- Start Page (startpage.com)

Similar to DuckDuckGo, Start Page is a privacy-focused search engine. The difference here is that Start Page only includes Google results versus DuckDuckGo's collaboration of multiple sources.

Another benefit is the ability to open any result through a "proxy" link.

- Qwant (qwant.com)

Qwant attempts to combine the results of several types of search engines into one page.

- Million Short (millionshort.com)

Offers a unique function: you can choose to remove results that link to the most popular one million websites. This will eliminate popular results and focus on lesser-known websites. You can also select to remove the top 100,000, 10,000, 1,000, or 100 results.

Tor Search Engines

Non-Tor Search Sites

- Ahmia (ahmia.fi)

While no engine can index and locate every Tor website, this is the most thorough option that I have seen.

Complement Ahmia well for Tor-based investigations.

- Onionland Search (onionlandsearchengine.com)

This service relies on Google's indexing of Tor sites which possess URL proxy links.

Whenever you see an onion link, you can replace "onion" within the address with "onion.ly" in order to view the content.
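
The onion-to-onion.ly substitution can be done on the host name alone so that the rest of the address is left untouched. This is a minimal Python sketch; `to_onion_ly` is a hypothetical helper name and the address in the example is a made-up placeholder, not a real Tor site.

```python
from urllib.parse import urlsplit, urlunsplit


def to_onion_ly(url):
    """Rewrite a .onion address so it points at the onion.ly proxy instead."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.endswith(".onion"):
        return url  # not an onion address; leave it unchanged
    netloc = host + ".ly"
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Placeholder address for illustration only.
print(to_onion_ly("http://exampleaddress.onion/page?id=1"))
```

Working on the parsed host avoids accidentally rewriting the word "onion" where it appears in a path or query string.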

Tor Search Sites

The strongest Tor search engines exist only on the Tor network.

- Torch

http://torch4st415712u2vr5wqwwwyueucvnrao4xajqr2klmcmicrv7ccaad.onion

- Haystack

http://haystal5nismn2hqkewecpaxetahtwhsbsa64jom2k2275afxhnpxfid.onion

Search Engine Collections

- Search Engine Colossus (searchenginecolossus.com)

Index of practically every search engine in every country.

- Fagan Finder (faganfinder.com)

Hundreds of options. Many of the search services are targeted toward niche uses, but you may find something valuable there.

FTP Search

Manual method of searching Google for FTP information.

To look for any files including the term "confidential":

inurl:ftp -inurl:(http|https) "confidential"

inurl:ftp -inurl:(http|https) "cisco" filetype:pdf
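
When several keywords need the same FTP treatment, the queries can be generated in one pass. The Python sketch below assumes the intended exclusion syntax is `-inurl:(http|https)`; the function name `ftp_queries` is hypothetical.

```python
def ftp_queries(terms, filetype=None):
    """Build one FTP-focused Google query per keyword."""
    base = "inurl:ftp -inurl:(http|https)"
    suffix = f" filetype:{filetype}" if filetype else ""
    return [f'{base} "{term}"{suffix}' for term in terms]


for query in ftp_queries(["confidential", "cisco"]):
    print(query)
for query in ftp_queries(["cisco"], filetype="pdf"):
    print(query)
```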

- Napalm FTP (searchftps.org)

FTP search engine. Often provides content that is very recent.

- Mamont (mmnt.ru)

The favorite feature of this engine is the "Search within results" option.

- FreewareWeb (freewareweb.com/ftpsearch.shtml)

Less robust than previous options, but should not be ignored.

- Nerdy Data (nerdydata.com/reports/new)

Nerdy Data searches the programming code of a website.

Perform a search of the code of concern and identify websites that possess the data within their own source code.

IntelTechniques Search Engines Tool

Perform automated searches through this tool; the code is in search.html (IntelTechniques Custom Tools Collection).
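
The general idea of submitting one set of terms to several engines can be sketched in a few lines of Python. This is not the code in search.html; it is a hypothetical illustration, and the address templates below are assumptions about each engine's public search URL format.

```python
from urllib.parse import quote_plus

# Assumed search address templates; "{}" is replaced by the encoded terms.
ENGINES = {
    "Google": "https://www.google.com/search?q={}",
    "Bing": "https://www.bing.com/search?q={}",
    "Yandex": "https://yandex.com/search/?text={}",
    "DuckDuckGo": "https://duckduckgo.com/?q={}",
    "Baidu": "https://www.baidu.com/s?wd={}",
}


def engine_urls(terms):
    """Return one search address per engine for the same terms."""
    encoded = quote_plus(terms)
    return {name: template.format(encoded) for name, template in ENGINES.items()}


for name, url in engine_urls('"Michael Bazzell"').items():
    print(f"{name}: {url}")
```

Each address can then be opened manually in a browser tab, which keeps the manual approach recommended for cache lookups.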
