MongoDB - slow '$group' performance


I have a MongoDB collection of more than 1,000,000 records. Each record is around 20 KB in size (hence the total collection size is about 20 GB).

The collection has a 'type' field (which can take roughly 10 different values), and I would like to get a per-type counter for the collection. There is also an index on the 'type' field.

I have tested two different approaches (using Python/pymongo syntax):

A naive method - issue a 'count' call for each distinct value:

  counter = {}
  for type_val in my_db.my_colc.distinct('type'):
      # count only the documents whose 'type' equals this value
      counter[type_val] = my_db.my_colc.find({'type': type_val}).count()

Using the aggregation framework with a '$group' stage:

  counter = my_db.my_colc.aggregate([
      {'$group': {'_id': '$type', 'agg_val': {'$sum': 1}}}
  ])
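For a like-for-like comparison with the first approach, the aggregation output can be turned into the same kind of dictionary. A minimal sketch, assuming the collection and field names above; note that older pymongo 2.x returns the raw command response (a dict with a 'result' list), while pymongo 3.x returns a cursor of result documents:

  result = my_db.my_colc.aggregate(
      [{'$group': {'_id': '$type', 'agg_val': {'$sum': 1}}}])
  # handle both pymongo return styles
  docs = result['result'] if isinstance(result, dict) else result
  counter = {doc['_id']: doc['agg_val'] for doc in docs}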

The performance I am seeing for the first approach is approximately two orders of magnitude better than for the second approach (roughly 1 min versus 45 min). This is likely related to the fact that the counting runs only on the index, without accessing the documents, whereas $group has to go over the documents one by one.
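One way to check that reasoning is to inspect the query plan for a single per-type query. A minimal sketch, assuming the names above; 'some_type_value' is a hypothetical value, and projecting only the indexed field is what allows an index-only ("covered") plan:

  plan = my_db.my_colc.find(
      {'type': 'some_type_value'},   # hypothetical type value
      {'type': 1, '_id': 0}          # project only the indexed field
  ).explain()
  print(plan)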

Is there a way to run an efficient grouping query over the 'type' index, one that uses only the index and thus achieves the performance of approach #1, but still goes through the aggregation framework?

I am using MongoDB 2.6.1.

Update: there is an open MongoDB Jira issue on this.

In the aggregation pipeline, the $group stage does not use indexes. It is applied after $match, which can indeed use indexes to speed things up.
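As an illustration of that ordering, here is a minimal sketch (collection and field names as in the question) where an indexed $match restricts the input before grouping; the listed type values are hypothetical:

  pipeline = [
      # $match comes first and can use the index on 'type'
      {'$match': {'type': {'$in': ['type_a', 'type_b']}}},  # hypothetical values
      # $group then only sees the already-filtered documents
      {'$group': {'_id': '$type', 'agg_val': {'$sum': 1}}},
  ]
  result = my_db.my_colc.aggregate(pipeline)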

Cheers,

