Originally reported in: openaire/iis#1326
The document similarity algorithm fails when run on a non-deduplicated OpenAIRE Graph containing 300M publications (the deduplicated graph contained 200M).
After an in-depth inspection, covered in openaire/iis#1326 (comment), it turned out that the document similarity sources need to be modified to increase the allowed timeout value, which should be defined in the sim1-postprocess-s1-e1-filter-sims.pig Pig script.