Discuss the articles posted on Dev.Opera.
By pavel.studeny
Thursday, 10. April 2008, 20:50:06
Indexing and searching in Opera - visited pages search
In this article, Pavel Studený lifts the lid off Opera Quick Find History Search, an exciting new Opera feature that allows you to search the full text of previously-visited pages.
( Read the article )
By FataL
Tuesday, 15. April 2008, 18:01:08

I have a question regarding the indexing algorithm...
Why doesn't Opera 9.5 include parts of the URL in the index?
For example, I very often read the Desktop Team blog. It has the word "desktopteam" as part of its URL. When I type desktopteam into the address field, I get no results. It's strange to have all page content indexed, but not the URL...
Any comments on this issue?
By pavel.studeny
Wednesday, 23. April 2008, 15:40:40

Only domains are indexed from a URL. It might change.
Furthermore, some types of page, such as https, are not indexed to protect your privacy.
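To illustrate the two rules above, here is a minimal sketch in Python. It is not Opera's code: the function name `url_index_terms`, the splitting of the host name on dots, and the example.com URLs are all assumptions made for the illustration.

```python
from urllib.parse import urlsplit

def url_index_terms(url):
    """Hypothetical helper: terms a domain-only indexer would keep for a URL."""
    parts = urlsplit(url)
    # Assumption: secure pages are skipped entirely, as described for https.
    if parts.scheme.lower() == "https":
        return []
    host = parts.hostname or ""
    # Only the host name is kept; path, query and fragment are ignored.
    return [label for label in host.split(".") if label]

print(url_index_terms("http://blogs.example.com/desktopteam/"))
print(url_index_terms("https://bank.example.com/account"))
```

Under these assumptions a word that appears only in the path, such as "desktopteam", never reaches the index, which matches the behaviour reported earlier in the thread.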
By kamalesh
Thursday, 24. April 2008, 00:25:46

Great article on another cool Opera innovation, Pavel.

Can you explain whether the results from Opera History Search should be in the same order as those from the address bar? It appears that OHS sorts strictly by relevance, while searching from the address bar sorts by relevance AND most recent. Am I seeing that correctly?
(This is in Build 4784, not sure if you'll further tweak this in subsequent builds.)
By FataL
Thursday, 24. April 2008, 18:41:47

Originally posted by pavel.studeny:
Only domains are indexed from a URL. It might change.
Would be great!

Originally posted by pavel.studeny:
Furthermore, some types of page, such as https, are not indexed to protect your privacy.
That's fair, but I would make it depend on the Cache HTTPS option (if I have decided to cache HTTPS pages, why not allow searching them?).
By olli
Friday, 25. April 2008, 14:54:30

"Furthermore, some types of page, such as https, are not indexed to protect your privacy."
IMO that is a stupid bug
By pavel.studeny
Tuesday, 27. May 2008, 15:35:40

Yes, the results can be sorted in a few different ways. They are sorted by relevance in opera:historysearch, while the address bar is tuned for the best performance, in case you type too quickly.
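For readers who want to see how two orderings of the same hits can differ, here is a toy sketch in Python. It only mirrors the behaviour kamalesh described (pure relevance versus relevance plus recency); the scores, dates and function names are made up, and nothing here is taken from Opera's implementation.

```python
from datetime import datetime

# Hypothetical history hits: (title, relevance score, last visit).
hits = [
    ("Page A", 0.9, datetime(2008, 4, 1)),
    ("Page B", 0.9, datetime(2008, 4, 20)),
    ("Page C", 0.5, datetime(2008, 4, 23)),
]

def by_relevance(results):
    # Highest score first; the sort is stable, so ties keep their input order.
    return sorted(results, key=lambda r: r[1], reverse=True)

def by_relevance_then_recency(results):
    # Highest score first; among equal scores the most recent visit wins.
    return sorted(results, key=lambda r: (r[1], r[2]), reverse=True)

print([title for title, _, _ in by_relevance(hits)])
print([title for title, _, _ in by_relevance_then_recency(hits)])
```

The two lists differ only where scores tie: the second ordering moves the more recently visited page ahead.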
By HaJotKE
Banned User
Tuesday, 27. May 2008, 15:57:50

Originally posted by olli:
IMO that is a stupid bug
Completely agreed...

At least worth an option in Preferences Editor (opera:config)!
By Profnovice
Tuesday, 3. June 2008, 11:00:09

I have one question concerning indexing!
Is it possible to index all the content of pages without including the URL in that content?
By eestlane
Tuesday, 15. July 2008, 12:49:21

A good article. However, I couldn't find out how the rank of a word is calculated before it goes into the database.
I'm planning to develop something similar with PHP and MySQL.
By gnpk
Tuesday, 23. September 2008, 19:05:51

Can you explain a little about how the rank for a particular word is created? There might be words that are skipped in the ranking (for example, a word like "the" in English). But that is very much language dependent. As mentioned in the post, some languages might not have such grammatical elements (btw, I didn't know about the lack of spaces in Japanese and Arabic - thanks for that). So, how do you pick which words become part of the inverted index?
Also, if you could explain how you pick words in non-English languages (specifically Asian languages), that would be very interesting to know.
Gangadhar
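As a point of reference for the question above, this is a generic textbook-style sketch of an inverted index with a stop-word list, written in Python. It is not how Opera ranks or selects words: the stop-word set, the `\w+` tokenizer (which only suits languages that separate words with spaces) and the use of raw term counts as the "rank" are all assumptions.

```python
import re
from collections import defaultdict

# Assumed, English-only stop-word list; a real one would be language specific.
STOP_WORDS = {"the", "a", "an", "of", "and"}

def tokenize(text):
    # Lowercase, split into word-character runs, and drop stop words.
    return [w for w in re.findall(r"\w+", text.lower()) if w not in STOP_WORDS]

def build_index(pages):
    # Maps each word to {page_id: number of occurrences on that page}.
    index = defaultdict(dict)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word][page_id] = index[word].get(page_id, 0) + 1
    return index

def search(index, word):
    # Pages containing the word, most occurrences first, ties broken by id.
    postings = index.get(word.lower(), {})
    return sorted(postings, key=lambda p: (-postings[p], p))

pages = {"p1": "The Opera browser and the Opera blog", "p2": "Opera news"}
index = build_index(pages)
print(search(index, "Opera"))
```

Languages written without spaces would need a different `tokenize` step, which is exactly the language-dependent part the question is about.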
By jeremyhudson
Friday, 21. November 2008, 15:45:09

A really useful new feature in the Opera browser, although it is still limited in the area of URL indexing, as was already pointed out above. Such restrictions call for at least more customizable preferences. Still, it's good to see that browser developers don't rest on their laurels.
By kamalesh
Friday, 21. November 2008, 16:38:18

What limitation, Jeremy? Opera can find the web page title, all the contents of that page, and the entire URL path.
(And it's not new, since it's been available since April now; though, I guess it's new if you use Firefox's half-"awesome" bar or Chrome's "One Box" addr bar...)