CertGraph is a tool I've been developing to scan and graph the network of SSL certificate alternative names. It can be used to find other domains belonging to an organization, which may be several degrees removed and not always obvious.
The idea for this project came about after examining the SSL certificate for XKCD.com. If you look closely at the screenshot below you will see that the SSL certificate used on XKCD.com is also valid for many domains which have no relationship to XKCD or Randall Munroe.
This behavior is a side-effect of the CDN used by XKCD, in this case, Fastly. Fastly is putting many of their clients on the same certificate, likely in an effort to simplify their deployment. This works because SSL certificates can use the "Certificate Subject Alternative Name" extension to add a list of additional hosts that the certificate should be valid for in addition to the primary name specified in the certificate.
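To see the alternative names on a certificate for yourself, a minimal sketch using only Python's standard library could look like the following. The function names `alt_names` and `fetch_alt_names` are made up for this illustration and are not part of CertGraph; the sketch simply completes a TLS handshake and reads the Subject Alternative Name entries from the certificate the server presents.

```python
import socket
import ssl


def alt_names(cert: dict) -> list[str]:
    """Return the DNS entries of the Subject Alternative Name extension.

    `cert` is the dict form returned by ssl.SSLSocket.getpeercert().
    """
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]


def fetch_alt_names(host: str, port: int = 443, timeout: float = 5.0) -> list[str]:
    """Connect to host, complete a TLS handshake, and list the certificate's DNS names."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return alt_names(tls.getpeercert())


# Example (needs network access; the host is only an illustration):
#   for name in fetch_alt_names("xkcd.com"):
#       print(name)
```

Non-DNS entries (such as IP addresses) are filtered out because only host names are interesting for this kind of graph.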
There can also be many certificates issued for a single domain. This creates a many-to-many relationship between certificates and domains: an ideal target to graph.
Certificate Transparency logs provide an additional, excellent source of SSL certificates to query. Instead of connecting to each host to get its certificate, we can get them all from a single log or index.
Certificate Transparency Background
Flaws in the current system of digital certificate management were made evident by high-profile security and privacy breaches caused by fraudulent certificates being issued by "trusted" CAs.
The goal of Certificate Transparency is to provide an open auditing and monitoring system that lets any domain owner or certificate authority determine whether their certificates have been mistakenly issued or maliciously used.
Certificate Transparency allows domain owners to be notified whenever a certificate is issued for their domain informing them of any unauthorized certificates that may exist. In near real time!
The actual Certificate Transparency logs are hundreds of gigabytes in size and not indexed, so searching them directly is not ideal. However, there are public Certificate Transparency search engines that index all of the data for us so we can query it.
Unfortunately Facebook's tool requires you to be logged into a Facebook account to use it. But it does send you instant notifications whenever it detects a new certificate issued for a domain you are monitoring.
There are lots of subdomain enumeration tools already (for example Sublist3r), but they all work either by brute-forcing domains or by searching indexes like Google. CertGraph can also help enumerate subdomains, but much faster and with much more accuracy. This is because every domain CertGraph encounters is known to be a valid domain. Although CertGraph may not encounter every subdomain, it should have no false positives.
Note: for best results use a Certificate Transparency driver with the
Internal Domain leakage
CertGraph can help enumerate domains that you may not know are publicly listed inside your certificates' alt-names. This can happen when a certificate that is valid for both internal and external hosts is exposed externally, sharing the internal host names with the public. Below is an example of this.
$ certgraph -driver crtsh -ct-subdomains netflix.com | grep internal
staging-npp-internal.netflix.com
issues-internal.nrd.netflix.net
dev-npp-internal.netflix.com
npp-internal.netflix.com
api-internal.test.netflix.com
api-int-internal.netflix.com
api-int-internal.test.netflix.com
...
The graph visualization that CertGraph can output can be thought of as a trust graph. If a certificate is valid for one domain, which is being hosted with a cert for another domain, we can say that the second domain must trust the owner of the first domain, as they have the certificate for it. This idea can be expanded and chained to incorporate many certificates. If your graph includes many domains which should not have any trust relationship with you, this may indicate a problem.
Below is a real example of this, with the domains changed to red, green, and blue to make it more obvious and to protect the guilty.
$ certgraph -json blue.com > data.json
In this example Red, Green, and Blue are all different organizations, and we ran certgraph on only blue.com, which enumerated a handful of Blue's other domains and certificates. However, somehow certgraph ended up reaching green.com and then red.com. How is this possible? It turns out blue.com was serving an SSL certificate for green.com in an alt-name and www.green.com had an SSL certificate for
After some digging I learned that Blue previously owned green.com and sold it to Red. But Blue still possesses a valid SSL certificate for green.com and is serving it from blue.com. At this point Blue can SSL man-in-the-middle Red's green.com domain. Take a minute to let that sink in.
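One way to spot this kind of surprise automatically is to compare the domains a crawl returns against the apex domains you expect to own. The sketch below is a hypothetical post-processing step, not a CertGraph feature; `is_under` and `unexpected_domains` are names invented for this example, and the input is simply a list of domains such as the default output produces.

```python
def is_under(domain: str, apex: str) -> bool:
    """True if domain equals apex or is a subdomain of it."""
    domain = domain.lower().rstrip(".")
    apex = apex.lower().rstrip(".")
    return domain == apex or domain.endswith("." + apex)


def unexpected_domains(found, expected_apexes):
    """Return the discovered domains that fall outside every expected apex, sorted."""
    return sorted(
        d for d in set(found)
        if not any(is_under(d, apex) for apex in expected_apexes)
    )


# Domains as they might come back from a crawl of blue.com.
found = ["blue.com", "www.blue.com", "green.com", "www.green.com", "red.com"]
print(unexpected_domains(found, ["blue.com"]))
```

The suffix check includes the leading dot so that a name like notblue.com is not mistaken for a subdomain of blue.com.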
If any of the crawled domains use a CDN, CertGraph will skip the CDN certificate by default. CDN certificates may contain hundreds of unrelated alt-names introducing lots of unwanted noise into the data. The -cdn flag causes CertGraph to include CDN results in the search instead of skipping them.
How CertGraph Works
Currently CertGraph supports four drivers, which determine how CertGraph searches for and acquires certificates for domains.
- http: connects to domains on port 443 and collects the certificate from the TLS handshake.
- smtp: like http, but looks up the MX records of the domains and uses port 25 with
- crtsh: searches Certificate Transparency using crt.sh
- google: like crtsh but uses Google's Certificate Transparency search tool
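To give a feel for what a Certificate Transparency driver has to do, here is a rough sketch of querying crt.sh from Python. It is not how CertGraph itself is implemented. It assumes crt.sh's JSON output (the `output=json` query parameter) and its `name_value` field, which holds one or more names separated by newlines; both are assumptions about the crt.sh service rather than anything stated in this post, and the function names are invented for the example.

```python
import json
import urllib.parse
import urllib.request


def names_from_crtsh(records: list[dict]) -> set[str]:
    """Collect the unique host names found in a list of crt.sh JSON records."""
    names = set()
    for record in records:
        # Assumption: "name_value" may hold several names, one per line.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name:
                names.add(name)
    return names


def query_crtsh(domain: str, timeout: float = 30.0) -> set[str]:
    """Ask crt.sh for certificates covering subdomains of `domain` (needs network)."""
    # "%" is crt.sh's wildcard; urlencode escapes it as %25.
    query = urllib.parse.urlencode({"q": "%." + domain, "output": "json"})
    with urllib.request.urlopen("https://crt.sh/?" + query, timeout=timeout) as resp:
        return names_from_crtsh(json.load(resp))


# Example (needs network access):
#   print(sorted(query_crtsh("eff.org")))
```

Keeping the parsing separate from the network call makes it easy to test against saved responses.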
Under the hood, CertGraph is rather simple. It uses a modified breadth-first search, which allows it to crawl the graph while the graph is still being created, and to do so in parallel.
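The core idea can be sketched as a plain breadth-first search in which the neighbors of a domain are only discovered when that domain is visited. This is a simplified, sequential illustration and not CertGraph's actual code: it leaves out the parallelism, and `crawl`, `lookup`, and `max_depth` are names chosen for this example. `lookup` stands in for a driver and returns the alt-names found for a domain.

```python
from collections import deque
from typing import Callable, Iterable


def crawl(root: str, lookup: Callable[[str], Iterable[str]], max_depth: int = 5) -> dict[str, int]:
    """Breadth-first crawl; returns every discovered domain with its depth from root."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        domain = queue.popleft()
        if depth[domain] >= max_depth:
            continue  # record the domain but do not expand it further
        # The graph is built lazily: edges are only known after the lookup.
        for neighbor in lookup(domain):
            if neighbor not in depth:
                depth[neighbor] = depth[domain] + 1
                queue.append(neighbor)
    return depth


# A stand-in driver backed by a fixed table instead of real certificates.
fake_certs = {
    "eff.org": ["eff.org", "web6.eff.org", "maps.eff.org"],
    "web6.eff.org": ["staging.eff.org"],
}
print(crawl("eff.org", lambda d: fake_certs.get(d, [])))
```

Because domains are dequeued in the order they were found, each one is recorded at its shortest distance from the root.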
Wildcard domains are normalized to their parent domain. Unfortunately this is required because we do not know which subdomain host to connect to. Example:
*.example.com → example.com
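A minimal sketch of that normalization, assuming it amounts to dropping the leading wildcard label (the function name `normalize` is made up for this example and the lower-casing is an extra convenience, not something described above):

```python
def normalize(name: str) -> str:
    """Map a wildcard name to its parent domain; leave other names unchanged."""
    name = name.strip().lower().rstrip(".")
    while name.startswith("*."):
        name = name[2:]  # drop the "*." label
    return name


print(normalize("*.example.com"))
```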
CertGraph also has a few output modes:
- Domain list - list all the domains found, 1 per line as they are found (default)
- Domain adjacency list - prints more details such as host status, domain's certificate hash, and depth from root in graph (
- JSON Graph - JSON output for graphing on the Web UI (
- Save Certificates - Save the certificates in PEM format for later analysis (
$ ./certgraph -list eff.org
eff.org
staging.eff.org
leez-dev-supporters.eff.org
micah-dev2-supporters.eff.org
maps.eff.org
web6.eff.org
https-everywhere-atlas.eff.org
s.eff.org
max-dev-supporters.eff.org
httpse-atlas.eff.org
kittens.eff.org
dev.eff.org
max-dev-www.eff.org
atlas.eff.org
The domain adjacency list is printed in the following format:
Node Depth Status Cert-Fingerprint [Edge1 Edge2 ... EdgeN]
$ ./certgraph -details eff.org
eff.org 0 Good 5C699512FD8763FC50A105A14DB2526A10AE6EAC3E79F5F44A7F99E90189FBE5 [maps.eff.org web6.eff.org eff.org atlas.eff.org https-everywhere-atlas.eff.org httpse-atlas.eff.org kittens.eff.org]
web6.eff.org 1 Good AF842FA69A720E9FB2F37BAF723A20F80B8C2072693E55D0A1EA78C7BABE2699 [*.eff.org *.dev.eff.org *.s.eff.org *.staging.eff.org]
https-everywhere-atlas.eff.org 1 Good 5C699512FD8763FC50A105A14DB2526A10AE6EAC3E79F5F44A7F99E90189FBE5 [kittens.eff.org maps.eff.org web6.eff.org eff.org atlas.eff.org https-everywhere-atlas.eff.org httpse-atlas.eff.org]
maps.eff.org 1 Good 5C699512FD8763FC50A105A14DB2526A10AE6EAC3E79F5F44A7F99E90189FBE5 [maps.eff.org web6.eff.org eff.org atlas.eff.org https-everywhere-atlas.eff.org httpse-atlas.eff.org kittens.eff.org]
atlas.eff.org 1 Good 5C699512FD8763FC50A105A14DB2526A10AE6EAC3E79F5F44A7F99E90189FBE5 [eff.org atlas.eff.org https-everywhere-atlas.eff.org httpse-atlas.eff.org kittens.eff.org maps.eff.org web6.eff.org]
httpse-atlas.eff.org 1 Good 5C699512FD8763FC50A105A14DB2526A10AE6EAC3E79F5F44A7F99E90189FBE5 [eff.org atlas.eff.org https-everywhere-atlas.eff.org httpse-atlas.eff.org kittens.eff.org maps.eff.org web6.eff.org]
kittens.eff.org 1 Good 5C699512FD8763FC50A105A14DB2526A10AE6EAC3E79F5F44A7F99E90189FBE5 [eff.org atlas.eff.org https-everywhere-atlas.eff.org httpse-atlas.eff.org kittens.eff.org maps.eff.org web6.eff.org]
dev.eff.org 2 No Host
s.eff.org 2 Good AF842FA69A720E9FB2F37BAF723A20F80B8C2072693E55D0A1EA78C7BABE2699 [*.eff.org *.dev.eff.org *.s.eff.org *.staging.eff.org]
staging.eff.org 2 Good AC3933B1B95BA5254F43ADBE5E3E38E539C74456EE2D00493F0B2F38F991D54F [max-dev-supporters.eff.org leez-dev-supporters.eff.org max-dev-www.eff.org micah-dev2-supporters.eff.org staging.eff.org]
leez-dev-supporters.eff.org 3 Good AC3933B1B95BA5254F43ADBE5E3E38E539C74456EE2D00493F0B2F38F991D54F [staging.eff.org max-dev-supporters.eff.org leez-dev-supporters.eff.org max-dev-www.eff.org micah-dev2-supporters.eff.org]
micah-dev2-supporters.eff.org 3 Good AC3933B1B95BA5254F43ADBE5E3E38E539C74456EE2D00493F0B2F38F991D54F [max-dev-supporters.eff.org leez-dev-supporters.eff.org max-dev-www.eff.org micah-dev2-supporters.eff.org staging.eff.org]
max-dev-supporters.eff.org 3 Good AC3933B1B95BA5254F43ADBE5E3E38E539C74456EE2D00493F0B2F38F991D54F [max-dev-supporters.eff.org leez-dev-supporters.eff.org max-dev-www.eff.org micah-dev2-supporters.eff.org staging.eff.org]
max-dev-www.eff.org 3 Good AC3933B1B95BA5254F43ADBE5E3E38E539C74456EE2D00493F0B2F38F991D54F [max-dev-www.eff.org micah-dev2-supporters.eff.org staging.eff.org max-dev-supporters.eff.org leez-dev-supporters.eff.org]
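If you want to post-process this output, a small parser for one record is easy to write. The sketch below is based only on the format and sample output shown above; the `Node` type and its field names are my own for this example, and treating a trailing run of 32 or more hex characters as the fingerprint is an assumption made so that multi-word statuses such as "No Host" parse correctly.

```python
import re
from typing import NamedTuple


class Node(NamedTuple):
    name: str
    depth: int
    status: str
    fingerprint: str  # empty when no certificate was retrieved
    edges: list[str]


# Assumption: fingerprints are printed as long hex strings.
HEX = re.compile(r"[0-9A-Fa-f]{32,}")


def parse_line(line: str) -> Node:
    """Parse one 'Node Depth Status Cert-Fingerprint [Edge1 ... EdgeN]' record."""
    head, bracket, tail = line.partition("[")
    edges = tail.rstrip().rstrip("]").split() if bracket else []
    fields = head.split()
    name, depth, rest = fields[0], int(fields[1]), fields[2:]
    # The status may contain spaces, so peel the fingerprint off the end.
    fingerprint = rest.pop() if rest and HEX.fullmatch(rest[-1]) else ""
    return Node(name, depth, " ".join(rest), fingerprint, edges)


print(parse_line("dev.eff.org 2 No Host"))
```

Grouping the parsed nodes by fingerprint then shows which hosts share a certificate.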
CertGraph also includes a simple web interface for easy visualization of the graph. It can be accessed online at https://lanrat.github.io/certgraph or offline in the docs folder in the program source code.
The web UI is a single-page web interface that can visualize the graph output when using the -json flag. It can be run entirely offline.
$ certgraph -json example.com > example-graph.json
You can load your data into the web interface by uploading, pasting, or linking to a JSON file using the data dropdown menu.
For more information and examples, check out the project README on GitHub.