Rainman
Fullwidth characters
See discussion at Wikipedia:Village_pump_(technical)#Fullwidth_to_standard_width_redirects. It may interest you. This, that and the other [talk] 02:17, 11 May 2008 (UTC)
Flag of Macedonia (1991-1995)
Hello Rainman, since I do not know how to draw flags, I'm asking whether you could draw the '91-'95 Macedonian flag where the sun rays are of the same length. I have found just a small version. Could you draw the flag according to the other flags of Wikipedia? Thank you. Korpas (talk) 15:38, 6 January 2009 (UTC)
modification to search several Wikipedian sections at one time
rainman, did you have a chance to add this modification to the search? (as per: Village Pump Search Question) --stmrlbs|talk 20:52, 15 June 2009 (UTC)
- rainman, I just got your message, and wanted to thank you! I also had some questions. Is the software updated on a regular basis? and is the search open software? or can it at least be viewed? Thanks. --stmrlbs|talk 01:30, 16 June 2009 (UTC)
- This is fantastic! and so quick! Thanks for the other information about the search, and I will be sure to update the documentation after I try it. I've written enough software documentation to choke a horse, so I will be glad to do that. --stmrlbs|talk 17:36, 16 June 2009 (UTC)
barnstar for a great Search enhancement!
The Technology Barnstar
This barnstar is for listening to our problem and so quickly coming up with a great Search Enhancement! Thank You!!! stmrlbs|talk 05:59, 17 June 2009 (UTC)
You might be interested in the RFC about letting Google index the User Pages
I thought you might be interested in this RFC: Wikipedia:Requests_for_comment/User_page_indexing because the reasons given for indexing the User pages involve the Wikipedia search versus Google. --stmrlbs|talk 23:25, 29 June 2009 (UTC)
rainman, what are the differences in the way wikipedia searches and Google does? I figured since you've worked in this area, that you would be more aware of the differences. People think they are going to find all instances of a word on the net with a google search, but that is not always the case. You still sometimes have to do a separate search to find all instances of something on wikipedia, whether it is a google site search or a wikipedia search. examples:
- google search for horticulture - 1 result (found on wikipedia)
- wikipedia search for horticulture - 6,015 results
- google search for tiger - 2 results shown from wikipedia with 1st search / click "more results from this site" to get rest - 44,500 results
- wikipedia search for tiger - 59,027 results shown from wikipedia. A difference of over 14,500 more results from the wikipedia search.
One thing I've noticed is that Google seems to ignore for the most part the EXTERNAL LINKs section. Have you noticed this? Example:
- wikipedia search for "freenetpages.co.uk/"
- google search for "freenetpages.co.uk/" on wikipedia It finds a link in footnotes, but not in external links.
(we can just continue the conversation here as I have you in my watchlist now :)) --stmrlbs|talk 03:49, 1 July 2009 (UTC)
- Yes, well, I didn't do any systematic research into this, although it would be interesting. One thing I noticed is that sometimes a page is just absent from the Google index, and while certain words act as keywords for an article, others just don't. For instance, for most of 2008 Google was blind to this article: simple:Douglas Adams. Whatever you did (edited the page, tried different searches), it was just not in the index. And you already found a case with external links where Google is blind to that part of the text, but this can happen to other parts of the text as well. There is either a very sophisticated algorithm that deems these words useless, or there is some kind of temporary failure in Google's internal architecture that makes the results not show up. --rainman (talk) 09:41, 1 July 2009 (UTC)
- hmmm.. that's interesting. I wonder what other pages are left out. It would be interesting to run a google search of a common term against a wikipedia search of that same term, save the results, and run a utility compare to see the differences.
- I have found that google does not necessarily index every page on a site, even though they "index" the site. A friend of mine had an open forum. He wanted to use the "site google search" facility instead of the search that came with the forum software for performance reasons. But, google would only index about 10% of his site. I could not figure out why.. the forum changed regularly (with new posts), and the subjects were interesting. I knew other forums with the same software that were immediately indexed. He didn't have ads or link farms, or anything like that. But, google just didn't want to index his forum for some reason. The only posts that got indexed were those posts linked to from somewhere else.
- rainman, I've always used a couple of different searches if I really need to find something that is a bit obtuse. There used to be more differences between the different search engines.. they are getting more homogenous as time goes on. But there are still differences. I think a lot of people think google finds everything, but it does more decision making as far as relevance, and depending on what you are looking for, it can totally miss it, if it is a webpage, or part of a web page deemed as "irrelevant". I notice how google was blind to external links when I was trying to find all articles which linked to a defunct website. It gave me back no results when I knew there were many articles that had this link. I like google, and think it is one of the most user friendly search engines - but I know that sometimes I can't depend on it to find everything. --stmrlbs|talk 20:16, 1 July 2009 (UTC)
- Stmrlbs, I just noticed the Google searches you propose above aren't Google searches of Wikipedia alone. Using a Google search of only Wikipedia, you'll get 4,350 hits for "horticulture". That's why I love Google Toolbar, which has a button for searching websites. One time I was searching for a name and only a few sites turned up. I then searched using the website search button. Several times I've experienced that Google can thus access information on locked websites, IOW it bypasses the login security feature. This time it found the name, but very deep in the website where pot growers in British Columbia discussed their growing methods, how to smuggle it across the border into the USA, etc. Everyone was on a first name basis. Obviously the name I was searching for was someone I didn't know. I then got curious and "backed out" of the website and found no other content. Even the index page was blank. This was apparently their "secret" meeting place, but Google found it. Anyway, I hope that information about how to use Google to deep search one website is helpful. -- Brangifer (talk) 14:21, 1 July 2009 (UTC)
- that's not good. Google ignoring security requirements. It sounds like the site didn't have the login security properly set up (I hope that was the reason - rather than just bypassing security). The problem with Google indexing areas like this is that as thrilled as you were that google bypassed security, this kind of thing works both ways. Do you want some person googling and finding your bank history online, or something like that - because google is ignoring security? But, this doesn't sound like something google would do intentionally. I hope not.
- As for my example, the reasons given in the RFC for not using wikipedia search is that you can find all the results on multiple websites with one fell swoop of google. I showed with these examples that this is not the case, that if you want google to find all the instances of a word on a website, you will still have to do a site search of that site. So, that is multiple searches, not one search.
- as for the 4,350 results returned for a toolbar site search of wikipedia, my google site search - using the advanced options of the regular google search - for horticulture in wikipedia returns 6,480 results!! The Wikipedia search for horticulture returns 6,015 results. So, there are differences between the regular google site search, the Wikipedia search, and your googlebar site search, with your google bar search turning up the fewest results.
- So, which is the best search? Imo, even though I am definitely for anything to improve User friendliness, the ease is not the only factor in what makes a search a good search. With the algorithms being for the most part hidden in the popular search engines, sometimes it is hard to figure out what results are being dropped and for what reasons. Rainman has worked internally on the wikipedia search, and therefore, I thought he would be the best person to ask about those differences. If possible, I would like to document some of the differences so that people can make better decisions as to which search to use - depending on what they are looking for. Regardless of what comes out of the RFC.
- --stmrlbs|talk 19:00, 1 July 2009 (UTC)
- Unfortunately I only have anecdotal evidence, although it would be interesting to do more in-depth analysis, e.g. by using all wikipedia and all google hits for keywords and seeing in which articles they differ. Comparing numbers of hits can be tricky since wikipedia search gives exact numbers, while google gives an approximate number, so they are not really comparable for large number of hits. --rainman (talk) 00:49, 2 July 2009 (UTC)
- yes, just comparing straight numbers wouldn't be enough. But if there was a big difference in the numbers, then I would think it would indicate something. Probably the fact that google doesn't check the external links accounts for a lot, but if there was a way for the wikipedia search to turn off external link checking (or checking of any section with a certain name), then it would even it out a bit, and allow checking for other big discrepancies. But, I realize this isn't exactly high priority on anyone's list.. but I'm curious what the differences are now. --stmrlbs|talk 01:32, 2 July 2009 (UTC)
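The comparison discussed in this thread (save the hits from both searches for one keyword, then see in which articles they differ) could be sketched roughly as below. This is only an illustration under stated assumptions: the two lists of titles are assumed to have been collected by hand or by some other tool, and `normalize_title` and `compare_hits` are hypothetical helper names, not part of the Wikipedia search or of any Google API.

```python
def normalize_title(title):
    """Make trivially different spellings of a title compare equal.

    Assumption: underscores and spaces are interchangeable and case is
    ignored. Real title rules are stricter than this.
    """
    return " ".join(title.replace("_", " ").split()).casefold()


def compare_hits(wikipedia_hits, google_hits):
    """Return the titles found by only one search, and by both."""
    wiki = {normalize_title(t): t for t in wikipedia_hits}
    goog = {normalize_title(t): t for t in google_hits}
    return {
        "only_wikipedia": sorted(wiki[k] for k in wiki.keys() - goog.keys()),
        "only_google": sorted(goog[k] for k in goog.keys() - wiki.keys()),
        "both": sorted(wiki[k] for k in wiki.keys() & goog.keys()),
    }


# Hypothetical hit lists, for illustration only.
diff = compare_hits(
    ["Horticulture", "Garden_design", "Tiger"],
    ["horticulture", "Tiger Woods"],
)
print(diff["only_wikipedia"])
print(diff["only_google"])
```

Comparing the sets of titles rather than the hit counts sidesteps the problem noted above that one count is exact and the other approximate.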
just want to let you know that something is wrong with the new search
adding the check ALL/NONE box really makes setting up the search a lot easier. But, something seems to be wrong with the advanced search where all the boxes are checked --stmrlbs|talk 01:38, 2 July 2009 (UTC)
- Thanks, fixed. --rainman (talk) 11:41, 2 July 2009 (UTC)
- I like the changes. Are they documented anywhere? Are there any changes that are not apparent on the new search page? --stmrlbs|talk 19:42, 2 July 2009 (UTC)
Sitemap question
rainman, does Wikipedia have a Sitemap defined for the major search engines? Is it defined somewhere? I was wondering how the User Pages are defined (as far as priority) currently in this sitemap - if there is one. --stmrlbs|talk 19:45, 2 July 2009 (UTC)
- We used to have some sitemaps, but I think no search engines were actually using them. I would imagine any serious search engine would have a special module just for parsing and updating wikipedia, so our unreliable sitemaps would very probably be ignored anyway. --rainman (talk) 11:14, 3 July 2009 (UTC)
- Who would know for sure? Where should I be going to ask this question? -stmrlbs|talk 12:36, 3 July 2009 (UTC)
- No idea, I have a vague recollection of it being removed at some point, prolly should search through fixed bugs in bugzilla.wikimedia.org and in code commits on mediawiki.org. --rainman (talk) 14:42, 3 July 2009 (UTC)
- Thanks, rainman. I will look there. :) --stmrlbs|talk 17:25, 3 July 2009 (UTC)
Cambridge meetup 1 August
FYI, the fourth Cambridge meetup will occur on the afternoon of Saturday 1 August. Charles Matthews (talk) 14:02, 26 July 2009 (UTC)
Cambridge meetup 14 November
Another Cambridge meetup is planned for the afternoon of Saturday 14 November. Please contribute to the page and come along if you can. Charles Matthews (talk) 14:32, 16 October 2009 (UTC)
Search page
Hi, can you have another look at WP:VPT#Search page (if you're not watching it anyway)? We really need a way (as I think I mentioned before) of adding help links on the search results page. (Normally things like this are done through a MediaWiki: page, so that each project can customize its own interface, but in that thread you say there is no such page in this case.)--Kotniski (talk) 16:58, 30 October 2009 (UTC)
- Sorry, could you just clarify this for me? I'm not clear on the status of MediaWiki:Searchsubtitle and MediaWiki:Searchresults-title. Are they in use, or now redundant? They're listed at Special:Allmessages, but I can't see when they're shown. Thanks. Rd232 talk 08:14, 31 October 2009 (UTC)
- I'm not familiar with Translatewiki so I'm sticking with my plan, which was to put {{MediaWiki redundant}} on their talk pages. Rd232 talk 10:45, 31 October 2009 (UTC)
thought: if nothing at all matches the search query, MediaWiki:Search-nonefound is shown in addition to MediaWiki:Searchmenu-new. But the latter message refers to "checking the search results below". Perhaps Search-nonefound should be shown instead of Searchmenu-new, with the "create page" option added there? Rd232 talk 16:10, 1 November 2009 (UTC)
Hi. Could a link to the help be provided for Special:Search like at Special:Contributions ? I've proposed this at WP:VPT. We'd need a mediawiki page to specify the link I suppose ? Cenarium (talk) 18:58, 5 November 2009 (UTC)
- Yes, I imagine we would need something like MediaWiki:Sp-contributions-explain on the Search page, maybe you should comment on the bug 21391. Unfortunately, I'm extremely busy at the moment, so I cannot contribute towards programming this, although it is an easy interface tweak. --rainman (talk) 11:14, 6 November 2009 (UTC)
question about Logs Search
rainman, why is it necessary to have to enter the exact title of an article - exact including punctuation - in order to find an article in any of the logs? Especially the Deletion log? Why can the search not find a keyword in the title? That would be so much more user friendly. You will see a lot of people asking "what happened to my article" because they can't find it in the deletion log. I couldn't find a title because I searched for "Searching for the Wrong Eyed Jesus" [1] instead of "Searching for the Wrong-Eyed Jesus" [2]. Ack! Thanks for any help you can give. stmrlbs|talk 03:06, 17 November 2009 (UTC)
- Thank you, rainman. I submitted a bugzilla report - hope I did it correctly. The number is 21555. (I am not asking for a commitment on your part - this is just to let you know if you want to take a look). stmrlbs|talk 03:15, 18 November 2009 (UTC)
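To make the request above concrete, here is a minimal sketch of punctuation-insensitive keyword matching on titles. It is not how the log search works, and nothing in it comes from MediaWiki: `title_key` and `find_log_titles` are hypothetical names, and the list of logged titles is assumed to be available in memory purely for illustration.

```python
import re


def title_key(title):
    """Reduce a title to lowercase words, dropping punctuation.

    "Wrong-Eyed" and "Wrong Eyed" both become "wrong eyed".
    """
    return " ".join(re.findall(r"\w+", title.casefold()))


def find_log_titles(query, logged_titles):
    """Return logged titles containing the query as a run of whole words."""
    key = title_key(query)
    if not key:
        return []
    # Pad with spaces so that "eye" does not match inside "eyed".
    return [t for t in logged_titles if f" {key} " in f" {title_key(t)} "]


# The title from the question above, plus one unrelated title.
logged = ["Searching for the Wrong-Eyed Jesus", "Tiger"]
print(find_log_titles("Searching for the Wrong Eyed Jesus", logged))
print(find_log_titles("wrong eyed", logged))
```

With this kind of matching, both the hyphenated and the unhyphenated spelling would find the same log entry, and a keyword from the middle of the title would work as well.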
Indexing question
Thanks for your note at VPT indicating that the indexer had stopped, and would be restarted. I'm thinking that hasn't happened, and I'd like to explain why I think this, so you can let me know if my thinking is off-base. I first assumed that the indexer needs to make sure every new page title is in the database, so it would crawl through the new page list chronologically. Then I realized that the indexing is full text, so it has to reflect any change to a page, so perhaps it crawls through the recent change list, of which the new pages are a subset. In either case, my starting assumption is that, at any point in time, there is an earlier point in time such that all changes prior to that time are indexed, and all changes after that time are not yet indexed. I opened the New Page file, did a binary search looking for titles recognized, and identified Reynoldston, New York, created at 22:57, 19 May 2010, as in the database, and the next created file Blanket High School, created at 23:46, 19 May 2010, as not in the database. I did that check Monday evening. Tuesday morning, I checked again, saw that Blanket High School was still not indexed, and surmised that the indexer was not doing anything.
You confirmed my guess, and said it would be restarted.
However, this evening, I just checked again and it is still not the case that Blanket High School is indexed. It occurs to me that perhaps with a restart, it doesn't pick up where it left off, so maybe it is indexing away, but in a different section of changes. However, you can imagine that my first guess is that the indexer would start exactly where it left off, so the fact that it still hasn't indexed Blanket High School leads me to wonder if it really is in action.
I hope I'm not being too much of a pest, but I am working closely with a new editor who has created a fine new article, Terrain Gallery, and I'd like to report to the editor when the file is indexed. Is it possible that the indexer is not yet working, or is it the case that my approach to testing this is flawed?--SPhilbrickT 22:43, 25 May 2010 (UTC)
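The binary search described above can be sketched as follows. It relies on the same starting assumption stated in the comment (everything created before some point is indexed, everything after is not), which may not hold after a restart. The `is_indexed` check is a hypothetical stand-in for searching for the title by hand, and the first and last titles in the example list are made up for illustration.

```python
def find_index_frontier(pages, is_indexed):
    """Locate the boundary between indexed and unindexed pages.

    pages: titles in order of creation, oldest first.
    is_indexed: callable returning True if the search finds the title.
    Returns (last indexed title, first unindexed title); either may be None.
    """
    lo, hi = 0, len(pages)  # pages[:lo] are indexed, pages[hi:] are not
    while lo < hi:
        mid = (lo + hi) // 2
        if is_indexed(pages[mid]):
            lo = mid + 1
        else:
            hi = mid
    last_indexed = pages[lo - 1] if lo > 0 else None
    first_unindexed = pages[lo] if lo < len(pages) else None
    return last_indexed, first_unindexed


# "Older page A" and "Newer page B" are hypothetical placeholders.
new_pages = ["Older page A", "Reynoldston, New York",
             "Blanket High School", "Newer page B"]
indexed = {"Older page A", "Reynoldston, New York"}
print(find_index_frontier(new_pages, lambda title: title in indexed))
```

If the frontier does not move between two runs made some hours apart, that suggests the indexer is idle, provided the ordering assumption is right.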
Meetup
You have seen the site notice about the next Cambridge meetup, this Saturday 29 May. I thought those who have been in the past should have a personal invite, too. And reminder! Charles Matthews (talk) 21:39, 26 May 2010 (UTC)
Meetup Cambridge 8
Wikipedia:Meetup/Cambridge 8 will be on Saturday 24 July. Hope to see you there. Charles Matthews (talk) 20:39, 21 July 2010 (UTC)
Delay in Updating Search Index
The Search Index has not been updated since 12 August - i.e. 6 days ago.
I have reported that here but nothing has happened.
In April you resolved this, as reported here. Any chance you can look at this again? or tell me who to contact? - Thanks
Arjayay (talk) 07:59, 18 August 2011 (UTC)
- Thanks Arjayay (talk) 08:08, 23 August 2011 (UTC)
Problems with search results
On 18 October, I explained an increasing number of problems with search results on the Wikipedia:Village pump (technical) page. Briefly:-
- The number of matches varies, and matches disappear
- Some matches only show the article title, with no detail
- Matches move about
- False matches
You replied with this diff [3], including: "The problem was with search9 which had a stale version of one of search index slices."
The problem has returned. A search for "refered" [4] alternates between 11 matches, 6 of which I corrected 6 days ago (5 are in URLs etc. so cannot be corrected), and 12 matches, 7 of which are new cases that I have just corrected.
I suspect there is another "stale ... search index slice" - whatever that means. Could you look at this again please.
Arjayay (talk) 18:39, 9 November 2011 (UTC)
- Hi Arjayay, thanks very much for the report. You were indeed right, and it should be fixed now. --rainman (talk) 13:58, 10 November 2011 (UTC)
- Thanks - hope you don't mind me contacting you directly, but you seem to resolve most search problems in any case. - Arjayay (talk) 14:48, 10 November 2011 (UTC)
Weird search results
I have reported this at Wikipedia:Village_pump_(technical)#Weird_search_results but thought a direct notification to you might help. Arjayay (talk) 09:25, 23 November 2011 (UTC)
Search results
Search results are not including changes I made on 25 February, whilst refreshing a search generates a different selection of results. Looks like the old "stale version of one of search index slices" again.
Could you look at this please. Thanks - Arjayay (talk) 19:27, 29 February 2012 (UTC)
Updating Search Results
The search index has failed to update for three days.
I have reported this at Wikipedia:Village pump (technical) as instructed in Help:Searching#Delay_in_updating_the_search_index.
I don't know whether this is a separate problem from the "stale version of one of search index slices" or not.
Arjayay (talk) 19:06, 11 March 2012 (UTC)
Delay in updating the search index (again)
Sorry, it's me again.
After a couple of days of random search results (different numbers of matches when pressing refresh - usually typical of "stale slices") the search index has stopped updating; causing a backlog for us Wikignomes. I have reported this at Wikipedia:Village pump (technical) but usually get a quicker response by messaging you directly. Thanks - Arjayay (talk) 16:15, 7 September 2012 (UTC)
Hi Arjayay, Search indexing was stopped for a while in the past 24 hours. I had to turn it off so that I could migrate data from one of our data centers to the other. Sorry for the inconvenience! It should be back up and fully up to date now. If the problem you were experiencing is still present, please let me know! Peteryoungmeister —Preceding undated comment added 16:33, 7 September 2012 (UTC)
- Thanks for the message - the index does seem "stable" now i.e. the same number of matches, in the same order, on each refresh. It is not totally up to date - i.e. spelling mistakes corrected 24 hours ago still appear in lists of misspellings - hopefully this will correct overnight? (I'm in the UK = UTC+1) - Arjayay (talk) 16:49, 7 September 2012 (UTC)
Disambiguation link notification for June 21
Hi. Thank you for your recent edits. Wikipedia appreciates your help. We noticed though that when you edited Förster resonance energy transfer, you added a link pointing to the disambiguation page Dipole moment (check to confirm | fix with Dab solver). Such links are almost always unintended, since a disambiguation page is merely a list of "Did you mean..." article titles. Read the FAQ • Join us at the DPL WikiProject.
It's OK to remove this message. Also, to stop receiving these messages, follow these opt-out instructions. Thanks, DPL bot (talk) 08:54, 21 June 2014 (UTC)