Getting Started With Basic Google Searches
Hello and welcome. My name is John Strand and in this video, we’re going to be talking about some very basic Google searches.
Now we’ve got to take a couple of steps back and talk about what Google actually does. Google goes through and indexes all the different text and images and things it can find on the internet. I once had somebody describe Google’s entire business model as just creating a reverse index of the internet, and that may or may not be true, but the point is, it’s an incredibly powerful tool for security professionals to identify weaknesses in their security architecture that Google has indexed.
So I’m going to just show you just a couple of very, very, very basic Google searches that you can use in a variety of capture the flag scenarios and against your own site to try to find some vulnerabilities.
So I’m going to start with some basic searches that you can work with. One of the most heavily used ones is site:. The reason we use site: is that it makes your search focus in like a laser beam on just your domain. And usually, whenever I’m doing this, I’m looking for something or I’m actually kind of… I’d like to think of it as panning for gold. I’m sifting out all the things I don’t care about to try to get down to something interesting.
Now just to be clear, I’m not trying to hack any sites with this particular demo, but I’m showing you how you can identify vulnerabilities on your own infrastructure fairly quickly and fairly easily. So whenever I do a site:… And let’s say I put in Microsoft. I put in microsoft.com. That’s going to restrict all of my searches to just microsoft.com. So I can do site:microsoft.com and do cats and we’ll see if any website at Microsoft has anything about cats.
And here we go. It says, all right, “Download Kaggle Cats and Dogs dataset from Microsoft,” cats at Microsoft stories. So you can see we restricted our search to just that. And that’s pretty cool, especially whenever you’re looking for files.
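The site: search above can be sketched in code. This is only a minimal illustration, assuming Python and the common https://www.google.com/search?q= URL form. The function names build_site_query and search_url are hypothetical, and the script only builds strings; it does not send any request.

```python
from urllib.parse import quote_plus


def build_site_query(domain: str, terms: str = "") -> str:
    """Return a query restricted to one domain with the site: operator."""
    query = f"site:{domain}"  # no space after the colon
    if terms:
        query += f" {terms}"
    return query


def search_url(query: str) -> str:
    """Return a URL that opens the query in a browser (assumed URL form)."""
    return "https://www.google.com/search?q=" + quote_plus(query)


query = build_site_query("microsoft.com", "cats")
print(query)
print(search_url(query))
```

Pasting the printed URL into a browser should show the same kind of restricted results described above.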
So you can look for things like doc or you can look for like docx or ppt or find any number of different file extensions. Usually, whenever I’m doing a search though on a site, what I’m trying to do is sift through things that are easily identified with Google to try to find the lesser-known things that Google has indexed.
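Looking for files by extension can be sketched the same way. The video does not name the operator used, so this sketch assumes Google’s filetype: operator. The domain example.com is a placeholder for your own domain, and filetype_queries is a hypothetical helper name.

```python
def filetype_queries(domain: str, extensions: list[str]) -> list[str]:
    """Build one site-restricted query per file extension.

    Assumes the filetype: operator; the video only mentions the
    extensions themselves (doc, docx, ppt).
    """
    return [f"site:{domain} filetype:{ext}" for ext in extensions]


# example.com stands in for a domain you are authorized to review.
for query in filetype_queries("example.com", ["doc", "docx", "ppt"]):
    print(query)
```

Running one query per extension keeps each result list short enough to read through by hand.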
So what I’ll do is I’ll do a site:microsoft.com. And now what we can actually do is start excluding things that I already know exist. So I could do -www because I don’t care about www.microsoft.com. And I can do -docs like that and it’s going to exclude Microsoft docs. Here we’ve got, what is it, go.microsoft.com. It just says it’s a Microsoft site. I might find that interesting and I’ll throw it over. Once again, I’m not expecting to find anything super, super interesting. We’re not trying to do that at all, but I’m showing how you can exclude things, so I’m going to do -go, -docs, and let’s do -techcommunity. I want to remove that and then we’ll do -support.
So if you look up here at the top, you can see that we’re kind of building a list of all the different sites that exist at Microsoft. This may not sound all that interesting. However, whenever you’re looking at this from a security practitioner’s perspective, it becomes incredibly important, because there may be parts of your infrastructure that Google has indexed that you’re exposing that you never expected to expose ever under any circumstances at all.
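The sift-and-exclude loop can be sketched as a small helper. It assumes you paste in the result URLs you see by hand, pulls out the host label next to the domain, and builds the next query with a minus term for each one. The names subdomain_labels and exclusion_query are hypothetical, and the URL list is a hand-typed sample based on the hosts named in the demo, not captured search output.

```python
from urllib.parse import urlparse


def subdomain_labels(urls: list[str], domain: str) -> list[str]:
    """Collect, in first-seen order, the host label directly left of the domain."""
    labels: list[str] = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host.endswith("." + domain):
            prefix = host[: -len(domain) - 1]  # e.g. "docs" from docs.microsoft.com
            label = prefix.split(".")[-1]
            if label not in labels:
                labels.append(label)
    return labels


def exclusion_query(domain: str, excluded: list[str]) -> str:
    """Build site:<domain> followed by one minus term per known host label."""
    return " ".join([f"site:{domain}"] + [f"-{label}" for label in excluded])


# Hand-typed sample of hosts already seen in the results (illustrative only).
seen = [
    "https://www.microsoft.com/",
    "https://docs.microsoft.com/",
    "https://go.microsoft.com/",
    "https://techcommunity.microsoft.com/",
    "https://support.microsoft.com/",
]

print(exclusion_query("microsoft.com", subdomain_labels(seen, "microsoft.com")))
```

Each time a new host shows up in the results, add its URL to the list and rerun to get the next, narrower query.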
For example, you may have alternative VPN login portals, you may have remote administration pages for various websites, firewall administration pages, all kinds of different login pages, Tandberg devices, Polycom devices. All of these things will eventually show up as you start sifting through a website and all of the different parameters that can exist on that.
In fact, whenever I’m working with IANS, it’s not uncommon for me, when I’m on the phone for expert decision support, to be typing this in while I’m talking to a customer. It’s kind of a habit that I have, and so far twice in the past three years, while doing that on the phone with a customer, I have found completely exposed interfaces.
For example, I was able to find a full video camera interface for their security cameras for one of our customers. I was able to find a page without authentication that allows you to manage and edit and work with the TLS/SSL certificates on their websites. So this is pretty heavy stuff, and it just involves a little bit of curiosity and digging in.
Another one that I like whenever I’m working with sites is intitle:"index of". Now the vast majority of what you find if you work with this particular Google search isn’t all that interesting, but it does at least show how this can be useful. You see, if somebody has enabled indexable directories on their website, the server does just that. It gives you an index of the directory structure for the web server, and many times this will allow you to identify various directories, pull down source code for pages… And by the way, that source code is completely different from what you see when you do view source. You may find things like user IDs and passwords for backend database connections.
So this is one of my all-time favorites, working with “index of”, and here’s just a couple of examples from developer.apple.com and here’s the Apache Software Foundation distribution directory. Now, once again, I’m not trying to hack anything and show you, “Oh, this is how you hack a site,” but what happens is you’ll see something very similar to this.
If you have a vulnerable website in your organization, you’ll have a web server and it’ll list out all the directories for that web server, and then you’re able to go into those various directories and see various files. Now that may not sound interesting, but once again, if you start getting things like source code from indexable directories or documents with metadata, it starts getting very interesting, very, very quickly.
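A rough way to spot this on pages from your own servers is to check whether the page title starts with “Index of”, which is the title that server-generated listings such as Apache’s commonly use. This is a heuristic sketch, not a complete detector. The function names are hypothetical, the sample HTML is made up for illustration, and example.com is a placeholder domain.

```python
import re

# Capture the text between <title> and </title>, ignoring case and newlines.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def index_of_query(domain: str) -> str:
    """Build the site-restricted intitle:"index of" search for a domain."""
    return f'site:{domain} intitle:"index of"'


def looks_like_directory_listing(html: str) -> bool:
    """Heuristic: true when the page title starts with 'Index of'."""
    match = TITLE_RE.search(html)
    if not match:
        return False
    return match.group(1).strip().lower().startswith("index of")


# Made-up pages for illustration only.
listing_page = "<html><head><title>Index of /backup</title></head><body></body></html>"
normal_page = "<html><head><title>Welcome</title></head><body>Hello</body></html>"

print(index_of_query("example.com"))
print(looks_like_directory_listing(listing_page))
print(looks_like_directory_listing(normal_page))
```

A title match is only a hint that a listing is exposed; you would still open the page yourself to see what files are actually there.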
So this is just a basic Google search primer and these are the things that I do as table stakes anytime we’re working on a pentest because these things many times will turn up vulnerabilities that your standard scanner may not turn up and set as critical.
So be sure to check it out on your own organization. You’ll go through and just do site:, your domain, and then you’ll start doing minus the pages that you see show up. If you get all the way down to nothing and nothing surprises you, congratulations, that’s great. But if you start seeing things like weird authentication portals, default web pages, things of that nature, you’re going to want to clean those up.
So thank you so much for joining this particular edition of Getting Started with Black Hills Information Security. And I hope to see you in another video.
Don’t forget to check us out every Wednesday where we do Enterprise Security Weekly. So thanks again and I can’t wait to see you in another video.