I logged in early this morning to find 23 “Guests” on the directory. Further investigation showed that all the Guests came from the same IP, just moving fast, and every one of them was trying to leave comments on listings. IOW a bot: a comment spam bot.
Anyway, I banned the IP. It’s not something I like to do, because innocents get caught up in bans, but this is the kind of stuff that ruins the web for everybody.
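Spotting this kind of thing by eye works, but the same check is easy to sketch in code. What follows is only a rough illustration, not how the directory script does it: the session records, the `suspicious_ips` helper and the threshold of 10 are all made up for the example, and the IP addresses come from the reserved documentation ranges.

```python
from collections import Counter

# Hypothetical guest-session records as (ip_address, requested_path) pairs.
# These are invented for illustration; they are not real log entries.
sessions = [("203.0.113.7", "/links/comment")] * 23 + [
    ("198.51.100.4", "/links/"),
    ("198.51.100.9", "/blog/"),
]


def suspicious_ips(records, threshold=10):
    """Return {ip: session_count} for every IP with at least `threshold` sessions."""
    counts = Counter(ip for ip, _path in records)
    return {ip: n for ip, n in counts.items() if n >= threshold}


flagged = suspicious_ips(sessions)
print(flagged)
```

The threshold is a judgment call. Set it too low and a shared office or mobile gateway address gets flagged, which is exactly the "innocents caught up in bans" problem.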
It was no surprise to me that a day or two after I signed Indieseek.xyz up for Yandex Metrica, Yandex’s answer to Google Analytics, Yandexbot showed up and started indexing the site in depth, especially the directory. And Yandexbot keeps coming back for more. Yandex is right up front about their mission: they provide this extensive analytics service because they are a search engine company and it helps them discover new web pages – IOW, same reason as Google.
I’ve used Yandex Metrica on my blog site for several months and have been very pleased. They even have a WordPress plugin.
The reason I don’t use Google Analytics is that Google already knows enough about me and my websites. I don’t trust them, and I’m certainly not going to give them inside information on my website traffic. I don’t totally trust any search engine company, but I like to break up information about my websites into separate silos. Metrica is free, and in my experience it is more powerful than GA and reports in real time, which GA doesn’t.
Google already knows too much about you. Give Metrica a try.
Back to the spidering: I get very little traffic from Yandex and I don’t expect much. But when I do a site search on Duckduckgo.com I notice a little logo that says results enhanced by Yandex. While I can’t prove it, my theory is that DDG uses Yandex for really deep site search crawling. So being in Yandex’s index may have benefits unseen. It can’t hurt.
A few weeks ago I wanted to offer a copy/paste searchbox here on Indieseek.xyz so other webmasters could offer a search box on their site. The problem was I had no idea how to code it. I can edit some HTML but not devise it from scratch.
Then I remembered: I had offered a search box on one of my niche directories back in 2004, and I was using an earlier version of the same directory script.
Would Archive.org’s Wayback Machine have preserved that?
Would the code still work?
I pulled up my old directory on Wayback. Found the link to the ancient “Link to Us” page. Would this work? I clicked on it. After a long delay the page came up. And there was the copy/paste code! I copied it and saved it in a Notebook.
Now would it work? I edited the domain, quickly slapped it up on a server page, and It Worked! That search box code, made by a guy I hired back in the early Aughts, was still good.
I won’t bore you with my frantic quest for an HTML guide to learn how to code a textarea and all that. But an hour later I had my new Link to Us page up and running, complete with copy/paste code.
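For anyone facing the same puzzle, the trick with a copy/paste box is that the snippet has to be HTML-escaped inside the textarea, so the browser shows the code instead of rendering it. Here is a minimal sketch of that idea in Python. It is not the code recovered from the Wayback Machine: the `https://example.com/search` URL, the `q` parameter name and both helper functions are placeholders invented for the example.

```python
import html


def search_box_html(action_url, param="q"):
    """Build a plain GET search form that other webmasters could paste into a page.

    `action_url` and `param` are assumptions; the real values depend on the
    directory script being searched.
    """
    return (
        f'<form action="{html.escape(action_url, quote=True)}" method="get">\n'
        f'  <input type="text" name="{html.escape(param, quote=True)}" size="25">\n'
        '  <input type="submit" value="Search">\n'
        "</form>"
    )


def copy_paste_textarea(snippet):
    """Wrap a snippet in a read-only textarea, escaped so it displays as code."""
    return (
        '<textarea rows="6" cols="60" readonly>\n'
        f"{html.escape(snippet)}\n"
        "</textarea>"
    )


snippet = search_box_html("https://example.com/search")
page_fragment = copy_paste_textarea(snippet)
print(page_fragment)
```

The browser turns the escaped text back into plain characters when it displays the textarea, so a visitor who copies from the box gets working HTML.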
I think the ghosts on my long dead site, once my flagship directory, were looking out for me even after all these years. Web 1.0 to the rescue.
It’s always neat on a new domain and website to see which crawlers (aka Spiders, Robots) find you and what they do.
Indieseek.xyz is lousy with crawler bots.
Google found us first. Within 24 hours of my first testing posts (either a webmention to another site or when I added a link back on my Twitter profile), Googlebot was all over the site. Googlebot is very competent and well behaved, but voracious. This is to Google’s credit: that is what a search engine spider is supposed to be. Google is always keen to find new sites. Every day since then one, two, sometimes three Googlebots have been in the directory, getting into everything. They almost camp out there.
Bing, or a bot pretending to be Bingbot, found us second, a few days later. I’m not sure if it is the real Bingbot, because it merely checks the index page and leaves a query string of “amazon”. It came back several times, but I never detected it going deeper. It has not been back in a while. Bing always seems to hold back on indexing. You can check one week and they only have your index page. You check again weeks later and they have 6 pages; a few weeks more and they have 6 more.
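One way to settle the real-or-fake question is a forward-confirmed reverse DNS check. As far as I know, Bing’s own guidance for verifying Bingbot is that a reverse lookup of the visiting IP should give a hostname ending in `search.msn.com`, and a forward lookup of that hostname should return the same IP. The sketch below assumes that rule. The `is_real_bingbot` function, the fake resolvers and every hostname and address in the demo are invented, so the example runs without touching the network.

```python
import socket


def is_real_bingbot(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for a visitor claiming to be Bingbot.

    `reverse` and `forward` default to the real resolver calls; they are
    parameters only so the check can be exercised with fake lookups.
    """
    try:
        hostname = reverse(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return False
    if not hostname.lower().rstrip(".").endswith(".search.msn.com"):
        return False
    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False
    return ip in addresses


# Made-up lookup table standing in for DNS (documentation-range addresses).
FAKE_PTR = {
    "192.0.2.10": "msnbot-192-0-2-10.search.msn.com",
    "192.0.2.66": "crawler.example.net",
    "192.0.2.99": "fake.search.msn.com",  # claims the right domain, wrong forward record
}


def fake_reverse(ip):
    if ip not in FAKE_PTR:
        raise socket.herror(1, "Unknown host")
    return (FAKE_PTR[ip], [], [ip])


def fake_forward(hostname):
    # Only the first hostname maps back to its own address.
    if hostname == "msnbot-192-0-2-10.search.msn.com":
        return (hostname, [], ["192.0.2.10"])
    return (hostname, [], ["198.51.100.1"])


for visitor in ("192.0.2.10", "192.0.2.66", "192.0.2.99", "192.0.2.200"):
    print(visitor, is_real_bingbot(visitor, fake_reverse, fake_forward))
```

Called with just an IP, the function uses the real resolver. The forward step matters because anyone who controls the reverse DNS for their own address can make it claim any hostname they like.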
A few days after I started cross posting to Twitter the spambots showed up. Some I can only detect because they try to leave spam comments. So far the defenses are holding on those. The others are better behaved: one from a popular SEO tools site has been almost as voracious as Google in checking out the entire site, another, which may be somebody’s experiment, keeps coming back, and the third, unnamed, is from Asia and keeps coming back and just hanging on my help page. Weird. The latter three are not of any value to me, but I’m okay on bandwidth so I let them carry on.
So far, no traffic from any search engine. That does not surprise me as Indieseek has almost no inbound links.
The plan of action is to keep writing posts in the Indieseek blog and keep adding listings in the directory, and what will be will be.
In the old days, the Index page was the formal front door to a website, and the same holds true today. But modern search engines don’t care as much about the index page, so generally we teleport into some page deep in the interior of a website and the major doorways don’t really matter.
But for bookmarks on your many devices Indieseek has several potential doorways. And you can bookmark different doorways on different devices depending on the capabilities of that device and how you like to work with it.
Here are some suggested doorways to bookmark:
Indieseek.xyz (Home) – as minimalist as I could make it. If a plain search box is your preference this is for you. Bookmark on smartphone.
Indieseek.xyz/links/ (Directory) – the directory section homepage. This is perfect for those who prefer to browse and drill down through the categories. You are really at the heart of the directory here. And it also has a search form just in case. Bookmark on laptop, tablet.
Indieseek.xyz/blog/ (Blog) – should you be mostly interested in what I write, bookmark this. (Hey, no laughing in the peanut gallery! People read my stuff.) Bookmark on any device.
Of course you are welcome to bookmark any page you want; these are just suggestions. The choice is yours.
Category: Paranormal (and UFO) – not really my thing but fun to read. Good rabbit hole. Back when I had cable TV it seemed like there were always ghost hunter shows on, so it might be popular. That’s the thing: it is one of those topics people are passionate about – passionate enough to build their own amateur websites. Something to contemplate. Might spice things up and keep the index from being too – dry. Research needed.
Category: I need to find some way to deal with general non-fiction. For example, where would I list a publisher of multi-topic non-fiction? With small press, micro press and ebook press publishers on the rise, I need to figure out something. Thought needed.
Wiki – I’m toying with the idea of putting up a wiki on the server. 1. I’ve always wanted to play with a wiki, 2. I want to keep notes on ideas for directory building and search engine building and stuff. Seems like over the winter is a good time to mess with it.