The Internet we are familiar with is only the tip of the “Internet iceberg”, a small part of the entire Web space. It is called the “Surface Web”, or ClearNet. Sites in this segment are indexed by search engines, and their share is approximately 4%.
The remaining 96% is the invisible part, the “Deep Web”. This includes data belonging to companies, government agencies, scientific and medical centers, the military, etc. The information in this segment is not indexed by search engines. The lowest part of the “Internet iceberg” is the “DarkNet”, the “shadow” Internet, which is accessed using specialized software.
The history of the DarkNet dates back to the early 1970s, when the prototype of the future Internet was being developed, but until the 2000s almost nothing was known about it. In 2004, a special system of anonymizing servers was launched to hide the user's true location. Currently, these servers are accessed through a special Tor network client (Tor Browser). Data in this network is exchanged anonymously and in encrypted form, which lets users transmit information without fear of interception and lets attackers operate with practical impunity. For this reason, the DarkNet has become a popular platform for trading hacking tools and cyberattack services, as well as illegally obtained information, including customers' personal data.
In recent years, cybercriminal activity has increased significantly: the appearance of “black markets” on the DarkNet has contributed to the spread of malware, as well as tools and methods for circumventing fraud monitoring. But does a novice hacker know how to use all this and stay anonymous at the same time? Alas, yes: in the past, committing a cyberattack took many years of experience, but now many tools come with step-by-step guides, and for a small fee!
Research on pricing in the DarkNet “black market” shows that a substantial set of cyberattack tools (malware, ready-made phishing pages, password crackers, etc.) can be purchased for just a few dollars. A team of researchers studied thousands of ads in the five largest DarkNet “black markets” and compiled the Dark Web Market Price Index. The results show that carrying out a cyberattack has become much more affordable, and the barriers to entry that existed before have practically disappeared.
Another significant factor in the development of the “black market” was the emergence of cryptocurrencies. Digital coins, especially Bitcoin, are very popular in many services of the “shadow” Internet. They act as the main payment instrument, as buyers and sellers strive for maximum anonymity. The emergence of cryptocurrencies has breathed new life into the “black markets” of the DarkNet.
What does all this mean? First of all, you need to understand that fighting “black markets” is practically pointless at the moment: for every market that is closed, two new ones appear.
As users, we need to start taking the protection of our personal data seriously. Strong passwords, two-factor authentication, privacy protection services, the use of encrypted VPN networks, etc. will help in this.
As developers, we have to put security above convenience and force users to adhere to high standards of protection, whether they like it or not. Security should become an indisputable advantage and a weighty argument.
As auditors, we must regularly monitor the “shadow” segment of the Internet for new cyber threats that could harm the organization's information security. But do not limit your search for such information to the DarkNet alone.
At one time, DarkNet visitors used the encrypted Jabber messenger. Over time, it was replaced by Telegram, a newer secure messenger that became a safe haven for fans of anonymity. Telegram made it possible to create thematic channels, which, of course, led to the emergence of channels about the DarkNet. But how can the constant flow of information in these channels be analyzed?
After some research, we found that this problem can be solved in two ways: by accessing the Telegram API directly or by using one of the various Telegram channel aggregators. The main advantage of the latter is that the aggregator itself selects the popular channels, which greatly simplifies the work. We chose the aggregator tgstat.ru, which, unlike similar services, covers a large number of channels. Using Python 3 (the django, selenium, and pymorphy libraries), as well as NLTK, PostgreSQL, and Nginx+Gunicorn+Supervisor, we created a tool that periodically filters messages from various channels and chats and transfers them into a single database, which we can then query to track trends in the topics that interest us.
During implementation, we ran into a situation where similar news content is published by various sources, and we needed to identify similarities between these news items. To solve this problem, we used NLP techniques.
Using regular expressions, we extract from the text only sequences of characters that are longer than one character (this operation also removes one-character prepositions).
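The regular-expression step can be sketched roughly as follows. This is a minimal illustration, not the exact pattern used in the tool: the pattern `\w{2,}` and the function name `tokenize` are assumptions made for the example. In Python 3, `\w` matches Unicode word characters by default, so the same pattern works for Russian and English text.

```python
import re
from typing import List

# Assumed pattern (not taken from the original tool): runs of two or more
# word characters. One-character tokens, such as one-letter prepositions,
# do not match and are therefore dropped; punctuation is dropped as well.
TOKEN_RE = re.compile(r"\w{2,}")


def tokenize(text: str) -> List[str]:
    """Return all character sequences longer than one character."""
    return TOKEN_RE.findall(text)


if __name__ == "__main__":
    print(tokenize("I saw a dog in DarkNet"))
    print(tokenize("в сети и на форуме"))
```

Note that `\w` also matches digits and underscores; if only letters are wanted, a stricter character class would be needed.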