Organized by Botmaster Labs, and not planned on my part. I have no time; a video is required for the competition, which is a newfangled trend, although in my humble opinion it is easier to explain everything with good screenshots, and I do not really want to film anything. And there are very few profitable topics left: dumb spam no longer rules at all, here you need to think things through, and nobody will burn a working topic; at best people will try to shove in outdated ones, lightly powdered and in a beautiful wrapper. :) But that is not about us. In general, these three "no"s, I think, became the main barriers to participation for the majority of potential entrants. It is like the old rule about car repair: cheap, high-quality, fast; a shop can only deliver two of the three at once, so sit and choose what matters most to you. :) The competition is the same: I have time and can make a video, but I have no topic; or I have a topic and can make a video, but no time at all; or I have free time and a small topic, but the video scares me. It is already good if two of the conditions are met simultaneously. Well, okay, enough lyricism; to continue about myself. I had not planned to, and yet I took part in the competition; I had even chosen which article I would vote for. Say what you like, but Doz knows the software very well and uses it very sensibly. But today I found out that an intrigue has appeared in the competition: it turns out I will not be able to vote; only beginners who purchased the software in 2011 will, since the competition is aimed at them. I was a little surprised, but the owner is the boss. The competition is an advertising campaign, and Alexander knows best how to run it. So I decided to post an article after all; it is somewhat easier to write when it is clear for whom, since writing for the whole collective farm at once is practically impossible.
The long introduction is over, now to the point.
What does a beginner need once he has acquired such a super-harvester as the XRumer + Hrefer complex? That's right: to learn how to work with it, and to drop the illusion that by starting to spam with ready-made lists you can make money. If you think so, donate your money to charity right away. You need to learn to use the complex's tools, preferably tailoring them to yourself. The era of "grab more, throw further" is gone; quantity is giving way to quality. So we will collect a base for ourselves; if you do not learn to do this, you will be left behind. Hrefer, of course, will help us with this. If you plan to promote your resources in Google, then we also need to look for donor sites through Google; I think this is clear and logical. But Google, like the Mistress of the Copper Mountain, does not hand out its riches to just anyone: you need the right approach to it. Let me say right away: do not hope that you will be able to collect anything using the footprints you find in public. They are public precisely because they are worthless. I will not develop that topic further. Instead, I will show you how to assemble a base correctly so that you see results; you will work out the rest yourself, the main thing is to understand the principle. You must collect on the basis of the specific engines we need, not on the basis of "forums in general". This is the main mistake newbies make: not concentrating on the specific, but trying to cover everything at once. And one more thing: if you want to parse a more or less decent base, refuse to use search operators in your queries. No "inurl:", "site:", "intitle:" and so on; Google bans searchers like that instantly. So we carefully study the engines that XRumer currently works with:
Powered by php-Fusion
As of XRumer 7.07, the program has been taught several new engines:
forumi.biz, forumb.biz, 1forum.biz, 7forum.biz, etc.
phpBB-fr.com, Solaris phpBB theme
And the process of learning new things goes on continuously.
In general, we need to prepare correct queries for Hrefer to parse. Let's take a forum engine as an example, say SMF, and start disassembling it into parts for parsing. Our beloved Google will help us with this. Enter the query SMF Forums into Google: there is a lot of garbage in the results, so scroll to page 13 or so and pick any link. I came across this one: http://www.volcanohost.com/forum/index.php?topic=11.0. Open it and examine it. We need to find something characteristic on the page that can be applied to the search for other pages on this engine. In the footer we notice the inscription Powered by SMF 1.1.14; put it in quotation marks and enter it into Google, which reports that it knows about 59 million results for this query. Skim the links, then add another keyword or two to the query, for example "Powered by SMF 1.1.14" poplar or "Powered by SMF 1.1.14" viagra. We see that the query is gorgeous: the results contain only forums and almost no garbage.
Besides, as I said above, we are not interested in quantity but in quality. Moving on. From the same forum we take another phrase from the footer, quote it as well, and feed it to Google. In response it reveals that it knows more than 13 million results. Again we skim through the results, add extra words, and check the results with them. We make sure this query is also great and likewise yields almost no garbage. So we already have two iron queries. I suggest leaving the first forum alone for now and continuing to collect queries from other forums. Fortunately, Google responds readily to the query 2006-2008, Simple Machines LLC. From the results we take, for example, these forums: http://www.snowlinks.ru/forum/index.php?topic=1062.0 and http://litputnik.ru/forum/index.php?action=printpage;topic=380.0, and from their footers we take the following queries: "Powered by SMF 1.1.7" and "Powered by SMF 1.1.10" (I always advise putting queries for Hrefer in quotation marks, because we need quality above all). I think it is clear what we are doing; in the end we will have a base of queries for finding forums on the SMF engine (chosen here as an example, the rest of the engines work the same way).
It will look something like this:
"Powered by SMF 1.1.2"
"Powered by SMF 1.1.3"
"Powered by SMF 1.1 RC2"
"Powered by SMF 1.1.4"
"Powered by SMF 1.1.8"
"Powered by SMF 1.1.7"
"2006-2008, Simple Machines LLC"
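A query base like the one above is easy to generate and extend with a short script. Here is a minimal Python sketch; the version numbers, copyright lines, and topical keywords are illustrative picks from this article, not an exhaustive list:

```python
# Build quoted footer queries for Hrefer from a list of engine versions.
# The versions, copyright lines, and keywords below are examples only.

versions = ["1.1.2", "1.1.3", "1.1 RC2", "1.1.4", "1.1.7", "1.1.8", "1.1.10", "1.1.14"]
copyright_lines = ["2006-2008, Simple Machines LLC", "2001-2006, Lewis Media"]
extra_keywords = ["viagra", "guitar"]  # narrow the results by topic

queries = [f'"Powered by SMF {v}"' for v in versions]
queries += [f'"{line}"' for line in copyright_lines]

# Optionally combine each footer query with a topical keyword:
combined = [f"{q} {kw}" for q in queries for kw in extra_keywords]

with open("smf_queries.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(queries + combined))
```

Extending the base to another engine is then just a matter of swapping in that engine's footer strings.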
And that's not all. While collecting engine versions, on some SMF forums we find the footer inscription "2001-2006, Lewis Media". We check this query; it fully satisfies us too. We find a similar one: "2001-2005, Lewis Media". Going through more footers, we find another: "SMFone design by A.M.A, ported to SMF 1.1". We check it: excellent. And so on. Half an hour of work and you have a wonderful base of queries for the engine, and Google will ban you for these queries far less often than if you used operators in them. At the same time, your base will be much cleaner than if you used queries like "index.php?topic=", because for those Google returns not only the forums we need but also a lot of unrelated resources where someone managed to leave a link to a forum topic. You may object: what's wrong with that? Others left a link there, so we can too. But! Links can be left not only by XRumer but also by other programs; moreover, those can be specially tailored to leave comments on one particular resource, the so-called highly specialized software, and such links could also have been left by hand. I will repeat once again: it is not the volume of trash that matters to us but quality, a base built from correct queries, and that is what we will collect. A bonus of this method is that you barely need to configure the Sieve-filter in Hrefer; you can simply turn it off, because Google will return practically no garbage.
I believe it is very important to learn to use Hrefer correctly at the initial stage, because once you have learned this, you can always find a use for XRumer no matter how the situation changes. Protections keep getting more complicated, and if on some engines the protection has been strengthened so that XRumer cannot cope with it at the moment, then there is no point in spending resources collecting those links and then running XRumer over them; it is better to focus on what gives results. Conversely, if the Botmaster Labs team has taught XRumer something new, you can quickly dissect the new patient and prepare a base for it while the patient is still warm. Time is money; by the time you buy a base collected by someone else, the resource may no longer be relevant. Besides, collecting bases for yourself correctly significantly expands the "white" uses of XRumer. And that is exactly where everything is heading, whether we like it or not; a process of whitening, or at least graying, is under way. Black lists are a thing of the past.
All the remaining, purely technical aspects of working with Hrefer can be looked up in the help, and there is no point in dwelling on them here: all the numeric settings (threads, timeouts, and the like) are chosen empirically for each machine individually.
As a bonus, I will post here a template for parsing the Chinese search engine Baidu; the other day I was asked about it, so I knocked it together between other things. :)
Hostname=http://www.baidu.com
Query=s?wd=
LinksMask=
TotalPages=100
NextPage=
NextPage2=
CaptchaURL=
CaptchaImage=
CaptchaField=
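To make the template's fields concrete, here is a small sketch of how a Baidu search URL could be assembled from Hostname and Query plus a keyword. The `pn` paging parameter and the 10-results-per-page step are assumptions about Baidu's URL scheme, not something taken from the template itself:

```python
from urllib.parse import quote

# Field values from the Hrefer template above.
HOSTNAME = "http://www.baidu.com"
QUERY = "s?wd="

def baidu_url(keyword: str, page: int = 0, per_page: int = 10) -> str:
    # Assumption: Baidu pages results with an offset parameter `pn`.
    # Keywords must be percent-encoded UTF-8 (see the codes below).
    return f"{HOSTNAME}/{QUERY}{quote(keyword)}&pn={page * per_page}"

print(baidu_url("保险公司", page=2))
# → http://www.baidu.com/s?wd=%E4%BF%9D%E9%99%A9%E5%85%AC%E5%8F%B8&pn=20
```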
I ran a test parse: there was no ban, Hrefer collected resources briskly, and everything about the parsing was similar to Google; meanwhile there is a sea of Chinese resources, many with high PR, and plenty of places where no European foot has yet set. It is better to parse with Chinese queries. Google Translate will help here: type up a list of keywords in Russian and translate it into Chinese. True, you cannot add Chinese words to Hrefer's "Words" list directly; you need to re-encode them.
Instead of the Chinese words themselves:
伟哥 - viagra
吉他 - guitar
其他 - rest
保险公司 - insurance
put the corresponding codes into the words file in their place:
%E4%BC%9F%E5%93%A5
%E5%90%89%E4%BB%96
%E5%85%B6%E4%BB%96
%E4%BF%9D%E9%99%A9%E5%85%AC%E5%8F%B8
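There is no need to re-encode by hand: the codes above are simply the UTF-8 bytes of each word in percent-encoding, which Python's standard library produces directly:

```python
from urllib.parse import quote, unquote

# Percent-encode each Chinese keyword for Hrefer's words file.
words = ["伟哥", "吉他", "其他", "保险公司"]
for w in words:
    print(quote(w))  # UTF-8 percent-encoding of the word

# quote("伟哥") → "%E4%BC%9F%E5%93%A5"
# and it round-trips: unquote("%E4%BC%9F%E5%93%A5") → "伟哥"
```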
If you are promoting an insurance website, then a link placed in your profile on a thematic (!) forum, even a Chinese one found by the query "Forum SMF" 保险公司, will be very nice to have.
In conclusion, I would like to say that I have never understood people who complain that Hrefer parses badly or does not parse at all; I have always wanted to tell them: you simply do not know how to cook it. No parser collects results better than Hrefer; the queries just have to be correct. Hrefer is a car: good, solid, German-built, but a person drives it, and everything depends on how skillfully it is driven; you cannot make a car turn right and left at the same time.
A separate topic is cleaning bases; I covered it about three years ago for the previous competition. By and large everything there is still relevant, but these days you can skip the check for "200 OK"; I never really liked that process anyway: the error rate was high and a lot of useful material got filtered out. Now this can be done almost automatically while XRumer runs, although the process is not a complete analogue of the "200 OK" check. To the point: not long ago a wonderful feature appeared in XRumer: grabbing information from resources at the moment a project runs. It works like this: you enter a template, it is processed during the run, and the information collected by the template is written to the xgrabbed.txt file in the Logs folder. You can use this function for anything; the scope for imagination is huge. I use it once a week to remove "expired" links from my working base. It is no secret that forums die off every day, and the "Autograbbing" tool will help us clean the base of such resources.
After all, you must admit that when you open, for example, http://www.laptopace.com/index.php, you often see that the domain has already ended up parked, say at GoDaddy, and there is no forum there anymore. So, to throw this slag out of the base, we will do some grabbing. :) Open the page's source code and find the characteristic parked-page entry there.
Now all the "dead" domains parked at GoDaddy will be known to us by name.
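As an aside, the same dead-domain check can be approximated outside of XRumer with a short script. A minimal sketch; the parked-page marker strings are illustrative guesses, since real registrar pages vary:

```python
import urllib.request

# Marker strings that often appear on parked/expired-domain pages.
# These are illustrative guesses, not the actual GoDaddy markup.
PARKED_MARKERS = ["godaddy.com/park", "this domain is for sale", "parked free"]

def has_parked_marker(html: str) -> bool:
    # Pure check: does the page body contain any known parked-page marker?
    page = html.lower()
    return any(marker in page for marker in PARKED_MARKERS)

def looks_dead(url: str) -> bool:
    # A URL is "dead" for our purposes if it is unreachable or parked.
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            html = resp.read(65536).decode("utf-8", errors="replace")
    except OSError:
        return True
    return has_parked_marker(html)

# Usage: live = [u for u in urls if not looks_dead(u)]
```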
Here is a small selection for the "Autograbbing" tool, in case you want to clean your base of various "expired" domains: