Frequently Asked Questions

If you have any questions about our service, this is a good place to look for answers.

Data collection

How do you know which technologies are used by a site?

Primarily, we use information provided by the site itself when downloading web pages. In other words, we fetch web pages very much like a search engine, and analyze the results. Additionally, we use publicly available information from sources such as Tranco, Google, Microsoft and ipinfo.io.

How exactly does your website analyzer work?

We search for specific patterns in the web pages that identify the usage of technologies, similarly to the way a virus scanner searches for patterns in a file to identify viruses. We use a combination of regular expressions and DOM traversal for this search. We have identified several thousand indicators for technology usage. These indicators have different priorities, and based on the presence or absence of specific combinations of indicators in a specific context, we come to our conclusions.

These are examples of the information used by the indicators:

HTML elements of web pages
Specific HTML tags, for example the generator meta tag
JavaScript code
CSS code
The URL structure of a site
HTTP headers, for example cookies
HTTP responses to specific requests, for example compression
DNS records
Whois information
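
As a rough illustration of pattern-based detection, the sketch below scans HTML and HTTP headers with a few regular expressions and keeps the highest priority seen for each technology. It is not our actual analyzer: the indicator list, the priority values, the threshold and the function name detect are all hypothetical, chosen only to show the idea.

```python
import re

# Hypothetical indicators: (technology, source, pattern, priority).
# The real indicator set and its priorities are not shown here.
INDICATORS = [
    ("WordPress", "html",
     re.compile(r'<meta[^>]+name=["\']generator["\'][^>]+content=["\']WordPress', re.I), 10),
    ("WordPress", "html", re.compile(r"/wp-content/", re.I), 5),
    ("jQuery", "html", re.compile(r"jquery[.-]?[\d.]*(?:\.min)?\.js", re.I), 5),
    ("PHP", "headers", re.compile(r"^x-powered-by:\s*php", re.I | re.M), 8),
]


def detect(html, headers, threshold=5):
    """Return the technologies whose best matching indicator reaches the threshold."""
    sources = {"html": html, "headers": headers}
    scores = {}
    for tech, source, pattern, priority in INDICATORS:
        if pattern.search(sources[source]):
            # Keep the strongest piece of evidence found for this technology.
            scores[tech] = max(scores.get(tech, 0), priority)
    return {tech for tech, score in scores.items() if score >= threshold}


if __name__ == "__main__":
    page = ('<meta name="generator" content="WordPress 6.4">'
            '<script src="/js/jquery-3.7.1.min.js"></script>')
    print(sorted(detect(page, "Server: nginx\nX-Powered-By: PHP/8.2")))
```

Raising the threshold makes the sketch ignore weak evidence such as a lone /wp-content/ path, which trades false positives for false negatives.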

Additionally, we exploit dependencies between technologies. For example, if we find a WordPress site, we know that it is using PHP. A fair share of our data is based on such dependencies.
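
A minimal sketch of how such dependencies could be applied: start from the technologies that were detected directly and keep adding whatever they imply until nothing new appears. The IMPLIES table and the function name expand are hypothetical and serve only as an illustration.

```python
# Hypothetical dependency table: technology -> technologies it implies.
IMPLIES = {
    "WordPress": {"PHP"},
    "WooCommerce": {"WordPress"},
}


def expand(detected):
    """Add every technology implied, directly or indirectly, by the detected ones."""
    result = set(detected)
    stack = list(detected)
    while stack:
        tech = stack.pop()
        for implied in IMPLIES.get(tech, ()):
            if implied not in result:
                result.add(implied)
                stack.append(implied)  # follow chains such as WooCommerce -> WordPress -> PHP
    return result
```

With this table, a site on which only WooCommerce was detected would also be counted as using WordPress and PHP.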

A lot of research was necessary to build the analyzer, and we keep improving it all the time. We want it to be the best possible website analyzer.

How accurate is your information?

It is impossible for this type of survey to be 100% accurate, since websites can hide most of their technologies if they want to. See also our disclaimer for more information. Some errors in the technology identification are unavoidable. We try to balance false positives and false negatives (after eliminating as many as possible), and we try to make sure that the remaining errors do not cluster on one technology rather than another.
Our goal is to provide the most accurate and reliable web technology surveys, so that the answer to this question would be: it is as accurate as one can possibly get. We believe that we are not too far away from that goal.

How often do you visit a site?

That depends on a number of factors, but typically about once a month; some sites are visited less often.

Do you analyze only the home page or also inner pages and subdomains?

In most cases we crawl deeper, visiting a few sample pages.

Reports

How often do you update the reports?

All reports on our website are updated daily. Although we don't analyze every site every day (see above), we continuously add new information to our database, and we want new trends to be visible as quickly as possible. The much more extensive technology market reports are generated monthly.

Which websites do you count? Do you crawl all the web?

For the surveys, we count what we call the relevant web; see our technology overview for more explanations. We do crawl more sites, but we are convinced that our statistics would become less useful and less relevant if we included all the typical "throw-away" sites, parked domains and other types of spam sites.

In some of the market share reports, the figures don't add up to 100%. How come?

That is the case when websites use more than one of the technologies; for example, a website may use more than one server-side programming language. We could do the calculations differently, but then a usage of 50% would not necessarily mean that the technology is used by every second site, which we would find quite confusing.
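
A small worked example may help. The four sites below are made up, and the function names are hypothetical; the sketch compares the per-site percentage with an alternative that forces the figures to add up to 100%.

```python
from collections import Counter

# Made-up sample: each site maps to the server-side languages found on it.
SITES = {
    "a.example": {"PHP"},
    "b.example": {"PHP", "Java"},
    "c.example": {"Java"},
    "d.example": {"PHP"},
}


def usage_per_site(sites):
    """Percentage of sites using each technology; overlaps make the sum exceed 100."""
    counts = Counter(tech for techs in sites.values() for tech in techs)
    return {tech: 100 * n / len(sites) for tech, n in counts.items()}


def share_of_detections(sites):
    """Alternative calculation: percentage of all detections, which always sums to 100."""
    counts = Counter(tech for techs in sites.values() for tech in techs)
    total = sum(counts.values())
    return {tech: 100 * n / total for tech, n in counts.items()}


if __name__ == "__main__":
    print(usage_per_site(SITES))       # per-site usage, sums to more than 100
    print(share_of_detections(SITES))  # normalized shares, sums to 100
```

In this sample Java is used by two of four sites, so the per-site figure is 50% and really does mean every second site. The normalized figure for Java is 40%, which no longer corresponds to a share of sites.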

Why are your figures sometimes very different from figures published somewhere else?

The biggest source of confusion is that we measure technologies used for websites, whereas other surveys measure something else. For example, the well-known TIOBE Index measures the overall popularity of programming languages. C is more popular than PHP in that index, but C is very rarely used to build websites. Another example is Distrowatch, which measures the popularity of Linux distributions, but that includes the popularity of desktop installations. Therefore their ranking is different from ours.

Other figures published on the usage of web technologies are often based on different samples. For example, they may use very small random samples, or samples favoring specific geographical regions, or only a small fraction of the web, say the top 10,000 sites. They may include subdomains or even individual web pages in their counts, or they may even be based on polls of their website visitors. Even if there are no such differences in the measurement techniques, there are certainly still differences in the website analyzing methods. We know for sure that a lot of research has been done to develop our analyzing methods; we are not so sure about others.

Advanced Reports

What are these breakdown and segmentation reports in the navigation bar?

In the breakdown reports, you can see the usage of combinations of technologies, e.g. which JavaScript libraries are used together with which content management systems. This is an example of an overview breakdown report, showing the most popular technologies of two categories.

If you want more details, you have to navigate to a specific technology, e.g. WordPress, and then click on JavaScript Libraries under the Breakdown menu.

Within the WordPress report, if you click on JavaScript Libraries under the Segmentation menu, you get a similar report, showing the distribution of JavaScript libraries among all the websites that use WordPress as their content management system. You can switch between the breakdown report and the segmentation report by clicking on the Related Reports menu entry.

Breakdown and segmentation reports are very powerful analysis tools. You probably have to play around a bit to explore all the possibilities and to find your way through the navigation to the reports you want. Use this as an example: if you want to know which web server technologies are used in Kyrgyzstan, navigate from the Technologies overview to the Top Level Domain report. Then scroll all the way down to .kg for Kyrgyzstan (or use Ctrl-F in your browser to find it quickly) and click on it. Next, click on Web Servers under the Segmentation menu, and you will see the report you wanted.

Please be aware that some technologies have a very low representation in our sample. Breakdown and segmentation reports may have a high statistical variance in these cases; in other words, the figures may be unreliable. For instance, we know of only one site that uses Neapolitan (Wikipedia). Don't expect any useful statistics from such a data set.
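
To give a feeling for why small data sets are unreliable, the sketch below computes the usual normal-approximation margin of error for a percentage. This is a textbook formula used here purely as an illustration, not a description of how our reports are calculated; the function name and the 95% level (z = 1.96) are assumptions, and the approximation itself is rough for very small samples.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for an observed share p (0..1) among n sites."""
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    # The same observed share of 50% is far less trustworthy when it comes from few sites.
    for n in (4, 100, 10_000):
        print(n, round(100 * margin_of_error(0.5, n), 1), "percentage points")
```

The margin shrinks only with the square root of the number of sites, so a segment with a handful of sites can be off by tens of percentage points.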

Any other questions

If you have more questions, please feel free to post them in the forums or send them to us directly, if you prefer.
