
Thread: Need help and your time, for nothing in return!

  1. #1
    Join Date
    Mar 2008
    Location
    pd
    Posts
    497
    Rep Power
    147

    Need help and your time, for nothing in return!

    I hope I don't hit the 'reverse psychology' jackpot... I've been working on a project at spider_dot_my*, just because I was interested to find out about a couple of things, and I need some more 'real' data.

    If you've got a website that you wouldn't mind allowing an experimental and possibly broken spider to crawl, and you can edit your robots.txt to insert a token for verification, I'd be very grateful if you would submit one.

    Like I said, it's just a hobby at the moment, so I can't offer fame, much less fortune!

    Looking forward to seeing the errors your sites bring!
    *Still not working well enough to want external links, sorry. When spider_dot_my is working a bit better, there will only be a few searchable pages. Your sites should appear on the "where's my spider" page, and (if I decide to go with it) a "supporters" page.

  2. #2
    Join Date
    Mar 2008
    Location
    pd
    Posts
    497
    Rep Power
    147
    Replying to my own post, oh dear. Maybe I didn't make it attractive enough!

    spider.my (home of My Spider) is a search engine, built just to try out some ideas. I need site owners to submit websites. It currently has a very small database, so I can guarantee top ranking in search results for any sites submitted!

    In addition, several 'proper' search engines have already started crawling its status page, where recently crawled sites are linked. Your sites are very likely to appear as links on that page too.

    Status page: spider.my - home of My Spider

    Submit a site: http://spider.my/UserPage

    I'll be writing some analysis pages for admins soon, but I'd be happy to give manual feedback to any WMM submitters for now. Thanks!

  3. #3
    Join Date
    Nov 2007
    Location
    Tropicana, PJ
    Posts
    17
    Rep Power
    0
    are you still testing your spider/crawler?

  4. #4
    Join Date
    Mar 2008
    Location
    pd
    Posts
    497
    Rep Power
    147
    "are you still testing your spider/crawler?"
    Yes! Well, not so much the spider. I've been working on generating decent snippets and fixing some 'issues' between my webserver and the Apache proxy in front of it. Lots of distractions recently, so working slowly.

    The spider is still missing a few features that stop me from letting it discover its own sites, but I'm hoping to let it follow links from the small number of submitted sites in its current database sometime in September. Since WMM is in the database, WMM members' sites should be among the first to be crawled.

    I'm not promoting the site yet, as it lacks a lot of features. Every site submitted does help me fix more bugs in the spider, indexer and search, so I'm very grateful to submitters! One of the things I'm intending to do in the next few weeks is to improve feedback for admins, so I'll be begging for testers again then.

    If you have a site you wouldn't mind the spider visiting, then I'd be interested to hear what you think of the submission process. I haven't had much feedback on that. The 'edit robots.txt' requirement might be a bit strict, but at the moment I want to avoid people submitting YouTube, Amazon and Wikipedia!

  5. #5
    Join Date
    Nov 2007
    Location
    Tropicana, PJ
    Posts
    17
    Rep Power
    0
    OK, that's actually really cool that you've come up with this itsy-bitsy spider thingy. It really interests me. How can I help as a tester? I might want to submit my blog to your engine; I would like to see how it actually sees the contents. Let me know.

  6. #6
    Join Date
    Mar 2008
    Location
    pd
    Posts
    497
    Rep Power
    147
    Thanks for the offer. To submit a site, you have to 'sign in' - there's a link at top right in the heading box. All you need to do is enter your email address and the word 'new' in the password box. You'll get a password by email.

    It all looks a bit bare at the moment, sorry about that. I'm not much of a graphic designer, but I'm also deliberately trying to keep it 'minimal'.

    When you sign in, you can go to your 'user page' using the link at top right (it will be your email address if you've signed in). You can add a site there. At the moment, to verify a site you have to be an admin for the site, and you must be able to modify the site's robots.txt - impossible for blogs on blogspot.com, for instance. If you'd just like to see your site added to the spider.my index, I can edit the database.
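    A rough sketch of how a robots.txt token check like this could work. The thread does not say what the token looks like or where in the file it goes, so the comment-line format and the names `find_token` and `spider-verify` below are assumptions for illustration only, not spider.my's actual scheme.

```python
def find_token(robots_txt: str, token: str) -> bool:
    """Return True if the token appears as a word in any robots.txt comment.

    Assumption: the site owner adds the token inside a '#' comment,
    so that it never changes how other crawlers read the rules.
    """
    for line in robots_txt.splitlines():
        # Everything after the first '#' on a line is a comment.
        _, sep, comment = line.partition("#")
        if sep and token in comment.split():
            return True
    return False


# Hypothetical robots.txt body, as fetched from the submitted site.
sample = (
    "User-agent: *\n"
    "Disallow: /private/\n"
    "# spider-verify abc123\n"
)
print(find_token(sample, "abc123"))
```

    Matching only inside comments means a path that happens to contain the token text does not count as verification.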

    When I've got some site-owner feedback (next week or so), it should be a bit more interesting. On my TODO list: a list of all keywords and their frequencies, the internal link structure, and sitemap.xml and robots.txt data and suggestions. Soon after that, I want to provide stats on performance in searches (where your site was listed, how often it was clicked, etc.).
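    For readers wondering what a keyword-frequency and internal-link report involves, here is a minimal sketch using only the Python standard library. The class name `PageStats`, the word pattern and the example URLs are all assumptions made for illustration; nothing here is taken from the spider.my code.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PageStats(HTMLParser):
    """Collect word counts and same-host links from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.words = Counter()
        self.internal_links = set()
        self._skip = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                target = urljoin(self.base_url, href)
                # A link is "internal" if it stays on the same host.
                if urlparse(target).netloc == urlparse(self.base_url).netloc:
                    self.internal_links.add(target)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.words.update(re.findall(r"[a-z0-9']+", data.lower()))


page = PageStats("http://example.com/")
page.feed(
    "<html><body><p>Spider spider web</p>"
    "<script>var spider = 1;</script>"
    '<a href="/about">About</a>'
    '<a href="http://other.example/">Out</a>'
    "</body></html>"
)
page.close()
print(page.words.most_common(3))
print(sorted(page.internal_links))
```

    Script and style text is skipped so that variable names do not pollute the keyword counts.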

    There's a long way to go! You'll notice lots of unfinished stuff when you look closely. Today I broke the layout; I don't really understand CSS yet either! It's sort of back in shape, pending time to research CSS more deeply.

    One 'feature' I've noticed that might take me forever to remove: the Cached versions currently include the original scripts verbatim. That means I display other people's pay-per-click ads! OK, a secret benefit to being an early tester...
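    One possible way to drop scripts from a cached copy is sketched below with Python's standard `html.parser`. The names `ScriptStripper` and `strip_scripts` are hypothetical, and this is only an illustration of the general idea, not how spider.my builds its cache.

```python
from html.parser import HTMLParser


class ScriptStripper(HTMLParser):
    """Re-emit an HTML document with every <script> element left out."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
        elif not self._in_script:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if tag != "script" and not self._in_script:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False
        elif not self._in_script:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._in_script:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self._in_script:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self._in_script:
            self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_scripts(html: str) -> str:
    parser = ScriptStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


cached = strip_scripts(
    '<p>Hi &amp; bye</p><script src="ads.js"></script>'
    "<script>show_ads();</script><b>ok</b>"
)
print(cached)
```

    This only removes `<script>` elements. Inline event handlers such as `onclick` and `javascript:` URLs would still pass through, so a real cache view would need more than this.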

    Let me know if you can't edit your blog's robots.txt, and I'll add it to the DB directly.

  7. #7
    Join Date
    Nov 2007
    Location
    Tropicana, PJ
    Posts
    17
    Rep Power
    0
    Hi, no worries, I can edit my robots.txt; I just need to know what I need to insert. Hey, don't worry about the interface or looks at the moment. I know what Google looked like when they started.

    Ok I'm gonna add my site as soon as I get back home
    Last edited by ruhani; 18-08-2008 at 06:09 PM. Reason: forget to add something

  8. #8
    Join Date
    Feb 2008
    Location
    MLK
    Posts
    405
    Rep Power
    147
    Interesting. Trying it out now.

    EDIT: Just to notify you that the new password mail went into my Junk folder. This isn't going to be good.
    Last edited by JLHC; 18-08-2008 at 07:43 PM.

  9. #9
    Join Date
    Mar 2008
    Location
    pd
    Posts
    497
    Rep Power
    147
    "...that the new password mail went into my Junk folder. This isn't going to be good."
    Got an impression as to why that was? I imagine it may be because the domain is hosted on a dynamic IP address - some mail servers reject mail from lolyco.com outright. If there's some other inadequacy in the mail headers that I can change, then let me know and I'll add it to the list. There's not a lot I can do about the dynamic IP address, except to rent an IP address or some respectable mail hosting. There's no way spider.my can be run from my DSL connection in the long term, so I hope some dedicated or colocated hosting will also sort out the 'junk originator' problem.

    Thanks for giving me the heads-up. Like the site says, it's a work-in-progress, so I'm not expecting it to pass the "hey, it really works" test for a few months yet. If you've tried a search, you'll see where it's up to. Not far.

    I stopped the spider process some time ago - it has a 'quirk' to do with re-checking files that don't warrant it. That bug is on a long list of other trivial-to-annoying bugs, so I just stop the process when I'm not looking at it to prevent it upsetting any admins. It's running again now, so I'm hoping you'll see a huge slug of fetches on your stats soon. I don't see spider.my's spider in my stats, as everything is on the same network, so I exclude local traffic from the stats. I'd be interested to know if your stats identify the spider as a bot of some kind.
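    For anyone who wants to check their own logs for the spider, a small sketch follows. It assumes the web server writes the common Apache "combined" log format, and the user-agent string `ExampleSpider` is a made-up placeholder, since the thread does not say how the spider identifies itself.

```python
import re
from collections import Counter

# Apache "combined" log format: host, identity, user, [time],
# "request", status, size, "referer", "user-agent".
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) '
    r'"([^"]*)" "([^"]*)"$'
)


def hits_by_agent(lines, marker):
    """Count requested paths for user agents containing `marker`."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # not a combined-format line
        request, agent = match.group(3), match.group(7)
        if marker.lower() in agent.lower():
            parts = request.split()
            if len(parts) >= 2:
                hits[parts[1]] += 1
    return hits


# Hypothetical log lines; the agent name is an assumption.
log = [
    '192.0.2.1 - - [18/Aug/2008:19:00:00 +0800] "GET /robots.txt HTTP/1.1" '
    '200 120 "-" "ExampleSpider/0.1 (+http://example.com/bot)"',
    '192.0.2.2 - - [18/Aug/2008:19:00:05 +0800] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0"',
    '192.0.2.1 - - [18/Aug/2008:19:00:09 +0800] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "ExampleSpider/0.1 (+http://example.com/bot)"',
]
spider_hits = hits_by_agent(log, "ExampleSpider")
print(spider_hits)
```

    Whether a stats package labels those hits as a bot depends on its own list of known user agents, which this sketch does not try to reproduce.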

  10. #10
    Join Date
    Jul 2009
    Location
    malaysia
    Posts
    19
    Rep Power
    0
    I can give you the data you need. PM me if interested.
