On his blog, John Battelle pointed out a technical challenge we face at Indeed: handling duplicate job listings. Many job sites post the same job from the same company multiple times, and the same job is often listed on several different job sites. As a user, you don’t want these returned in your search results as separate jobs.
Indeed’s search engine has a duplicate handling mechanism whereby identical jobs get bundled together under one listing. For example, a search for Journalist jobs in New York City yields several such bundled results. Links to duplicate jobs are displayed below each job’s description snippet.
Even though we use some advanced techniques, identical jobs don’t always get bundled this way, for example because of formatting differences between job sites. We are continuing to refine the duplicate handling process to deal with such situations and ensure the best possible job seeker experience.
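To make the bundling idea concrete, here is a minimal sketch in Python. It is not Indeed's actual implementation, which this post does not describe. It assumes each job is a dict with hypothetical `title`, `company`, `location`, and `source` fields, and it treats two jobs as duplicates when those first three fields match after simple normalization (lowercasing, dropping punctuation, collapsing whitespace).

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return " ".join(text.split())


def bundle_jobs(jobs):
    """Group jobs that share a normalized (title, company, location) key.

    The field names are assumptions made for this illustration only.
    """
    bundles = defaultdict(list)
    for job in jobs:
        key = tuple(normalize(job[field]) for field in ("title", "company", "location"))
        bundles[key].append(job)
    return list(bundles.values())


# Made-up sample data: the first two entries differ only in formatting.
jobs = [
    {"title": "Staff Journalist", "company": "Example Media, Inc.",
     "location": "New York, NY", "source": "site-a"},
    {"title": "STAFF  JOURNALIST", "company": "Example Media Inc",
     "location": "New York NY", "source": "site-b"},
    {"title": "Copy Editor", "company": "Example Media, Inc.",
     "location": "New York, NY", "source": "site-a"},
]

bundles = bundle_jobs(jobs)
for bundle in bundles:
    print(bundle[0]["title"], "from", [job["source"] for job in bundle])
```

Exact matching on normalized fields only handles simple formatting differences such as case, punctuation, and spacing. Listings whose wording differs more substantially would need some form of fuzzy or similarity-based comparison, which this sketch does not attempt.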