Lucene can store the original value inside a field (Field.Store.YES), at the cost of a somewhat larger index. The gain is simpler code, because we don't have to serialize the tweets separately. Given that the efficiency requirements here are loose, this should not be a problem. If it becomes one, I suggest switching to a "real" key-value store.
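
To make the idea concrete, here is a minimal sketch of storing a tweet's text in the index and reading it back from a search hit. It assumes Lucene 8.x or 9.x on the classpath; the class name `StoredTweetExample`, the method `roundTrip`, and the field names `id` and `text` are made up for illustration and are not taken from the project.

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

// Hypothetical example class; assumes Lucene 8.x or 9.x.
public class StoredTweetExample {

    // Indexes one tweet with its text stored, then fetches the text back by id.
    static String roundTrip(String tweetId, String tweetText) throws IOException {
        // In-memory directory, used here only to keep the sketch self-contained.
        try (Directory dir = new ByteBuffersDirectory()) {
            IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
            try (IndexWriter writer = new IndexWriter(dir, config)) {
                Document doc = new Document();
                // StringField: indexed as a single exact token, good for ids.
                doc.add(new StringField("id", tweetId, Field.Store.YES));
                // TextField: analyzed for full-text search; Store.YES also keeps
                // the original string so no separate serialization is needed.
                doc.add(new TextField("text", tweetText, Field.Store.YES));
                writer.addDocument(doc);
            }

            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                TopDocs hits = searcher.search(new TermQuery(new Term("id", tweetId)), 1);
                if (hits.scoreDocs.length == 0) {
                    return null; // no tweet with that id
                }
                // Loads the stored fields of the hit. Lucene 10 and later drop
                // this method in favor of a storedFields() accessor.
                Document stored = searcher.doc(hits.scoreDocs[0].doc);
                return stored.get("text");
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("42", "Lucene can store the original text"));
    }
}
```

If the stored text later turns out to be too expensive, only the `Field.Store.YES` flag on the `text` field and the read path in `roundTrip` would need to change in a sketch like this one; the lookup by `id` could then point into an external key-value store instead.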