一、简介

这里简单介绍一下各个工具的使用场景,一般用mysql,redis,mongodb做存储层,hadoop,spark做大数据分析。

  • mysql适合结构化数据,类似excel表格一样定义严格的数据,用于数据量中,速度一般支持事务处理场合

  • redis适合缓存内存对象,如缓存队列,用于数据量小,速度快不支持事务处理高并发场合

  • mongodb,适合半结构化数据,如文本信息,用于数据量大,速度较快不支持事务处理场合

  • hadoop是个生态系统,上面有大数据分析很多组件,适合事后大数据分析任务

  • spark类似hadoop,偏向于内存计算,流计算,适合实时半实时大数据分析任务

移动互联网及物联网让数据呈指数增长,NoSql大数据新起后,数据存储领域发展很快,似乎方向都是向大数据,内存计算,分布式框架,平台化发展,出现不少新的方法,普通应用TB,GB级别达不到PB级别的数据存储,用mongodb,mysql就够了,hadoop,spark这类是航母一般多是大规模应用场景,多用于事后分析统计用,如电商的推荐系统分析系统。IAO

看标题,这里是不是跑题了呢,显然不是,了解一下mongodb在存储中的位置还是非常有必要的,explain 和 hint 一看就知道是从mysql借鉴过来的(猜的),实际就是检测查询语句的性能和使用强制索引

二、explain

先写入测试数据

db.test.insertMany([
{ "_id" : 1, "a" : "f1", b: "food", c: 500 },
{ "_id" : 2, "a" : "f2", b: "food", c: 100 },
{ "_id" : 3, "a" : "p1", b: "paper", c: 200 },
{ "_id" : 4, "a" : "p2", b: "paper", c: 150 },
{ "_id" : 5, "a" : "f3", b: "food", c: 300 },
{ "_id" : 6, "a" : "t1", b: "toys", c: 500 },
{ "_id" : 7, "a" : "a1", b: "apparel", c: 250 },
{ "_id" : 8, "a" : "a2", b: "apparel", c: 400 },
{ "_id" : 9, "a" : "t2", b: "toys", c: 50 },
{ "_id" : 10, "a" : "f4", b: "food", c: 75 }]);

写入成功返回值

{
"acknowledged" : true,
"insertedIds" : [
1,
2,
3,
4,
5,
6,
7,
8,
9,
10
]
}

开始查询

> db.test.find();
{ "_id" : 1, "a" : "f1", "b" : "food", "c" : 500 }
{ "_id" : 2, "a" : "f2", "b" : "food", "c" : 100 }
{ "_id" : 3, "a" : "p1", "b" : "paper", "c" : 200 }
{ "_id" : 4, "a" : "p2", "b" : "paper", "c" : 150 }
{ "_id" : 5, "a" : "f3", "b" : "food", "c" : 300 }
{ "_id" : 6, "a" : "t1", "b" : "toys", "c" : 500 }
{ "_id" : 7, "a" : "a1", "b" : "apparel", "c" : 250 }
{ "_id" : 8, "a" : "a2", "b" : "apparel", "c" : 400 }
{ "_id" : 9, "a" : "t2", "b" : "toys", "c" : 50 }
{ "_id" : 10, "a" : "f4", "b" : "food", "c" : 75 }
> db.test.find().count();
10
> db.test.find({ c: { $gte: 100, $lte: 200 }}).count()
3
> db.test.find({ c: { $gte: 100, $lte: 200 }}).explain("executionStats")
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "test.test",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"c" : {
"$lte" : 200
}
},
{
"c" : {
"$gte" : 100
}
}
]
},
"winningPlan" : {
"stage" : "COLLSCAN",
"filter" : {
"$and" : [
{
"c" : {
"$lte" : 200
}
},
{
"c" : {
"$gte" : 100
}
}
]
},
"direction" : "forward"
},
"rejectedPlans" : [ ]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 3,
"executionTimeMillis" : 0,
"totalKeysExamined" : 0,
"totalDocsExamined" : 10,
"executionStages" : {
"stage" : "COLLSCAN",
"filter" : {
"$and" : [
{
"c" : {
"$lte" : 200
}
},
{
"c" : {
"$gte" : 100
}
}
]
},
"nReturned" : 3,
"executionTimeMillisEstimate" : 0,
"works" : 12,
"advanced" : 3,
"needTime" : 8,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"direction" : "forward",
"docsExamined" : 10
}
},
"serverInfo" : {
"host" : "iZbp1g11g0cdeeq9ht9fhjZ",
"port" : 27017,
"version" : "3.4.12",
"gitVersion" : "bfde702b19c1baad532ed183a871c12630c1bbba"
},
"ok" : 1
}

看一下几个关键词

"stage" : "COLLSCAN",

"nReturned" : 3,

"totalDocsExamined" : 10,

全部扫描,不走索引,这里只是演示,所以数据量比较少,如果数据量多起来这样查询将会很慢,甚至会卡死

COLLSCAN

这个是什么意思呢? 如果你仔细一看,应该知道就是CollectionScan,就是所谓的“集合扫描”,对不对,看到集合扫描是不是就可以直接map到数据库中的table scan/heap scan呢??? 是的,这个就是所谓的性能最烂最无奈的由来。

nReturned

这个很简单,就是所谓的numReturned,就是说最后返回的num个数,从图中可以看到,就是最终返回了三条。。。

docsExamined

那这个是什么意思呢??就是documentsExamined,检查了10个documents。。。而从返回上面的nReturned。

创建索引并查询

> db.test.createIndex({ c:1})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
> db.test.find({ c: { $gte: 100, $lte: 200 }}).explain("executionStats")
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "test.test",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"c" : {
"$lte" : 200
}
},
{
"c" : {
"$gte" : 100
}
}
]
},
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"c" : 1
},
"indexName" : "c_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"c" : [
"[100.0, 200.0]"
]
}
}
},
"rejectedPlans" : [ ]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 3,
"executionTimeMillis" : 0,
"totalKeysExamined" : 3,
"totalDocsExamined" : 3,
"executionStages" : {
"stage" : "FETCH",
"nReturned" : 3,
"executionTimeMillisEstimate" : 0,
"works" : 4,
"advanced" : 3,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"docsExamined" : 3,
"alreadyHasObj" : 0,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : 3,
"executionTimeMillisEstimate" : 0,
"works" : 4,
"advanced" : 3,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"keyPattern" : {
"c" : 1
},
"indexName" : "c_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"c" : [
"[100.0, 200.0]"
]
},
"keysExamined" : 3,
"seeks" : 1,
"dupsTested" : 0,
"dupsDropped" : 0,
"seenInvalidated" : 0
}
}
},
"serverInfo" : {
"host" : "iZbp1g11g0cdeeq9ht9fhjZ",
"port" : 27017,
"version" : "3.4.12",
"gitVersion" : "bfde702b19c1baad532ed183a871c12630c1bbba"
},
"ok" : 1
}

 再看看上面几个关键词

"stage" : "IXSCAN"

"totalDocsExamined" : 3,

瞬间就少了,这样查询时间也会大大减少

三、hint

这时一个很好玩的一个东西,就是用来force mongodb to excute special index,对吧,为了方便演示,我们做两组复合索引,比如这次我们在c和b上构建一下:

创建索引

> db.test.createIndex({ c:1,b:1})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 2,
"numIndexesAfter" : 3,
"ok" : 1
}
> db.test.createIndex({ b:1,c:1})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 3,
"numIndexesAfter" : 4,
"ok" : 1
}

  hint查询

> db.test.find({ c: { $gte: 100, $lte: 200 },b:"food"}).hint({c:1,b:1}).explain("executionStats")
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "test.test",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"b" : {
"$eq" : "food"
}
},
{
"c" : {
"$lte" : 200
}
},
{
"c" : {
"$gte" : 100
}
}
]
},
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"c" : 1,
"b" : 1
},
"indexName" : "c_1_b_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"c" : [
"[100.0, 200.0]"
],
"b" : [
"[\"food\", \"food\"]"
]
}
}
},
"rejectedPlans" : [ ]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 1,
"executionTimeMillis" : 0,
"totalKeysExamined" : 3,
"totalDocsExamined" : 1,
"executionStages" : {
"stage" : "FETCH",
"nReturned" : 1,
"executionTimeMillisEstimate" : 10,
"works" : 3,
"advanced" : 1,
"needTime" : 1,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"docsExamined" : 1,
"alreadyHasObj" : 0,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : 1,
"executionTimeMillisEstimate" : 10,
"works" : 3,
"advanced" : 1,
"needTime" : 1,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"keyPattern" : {
"c" : 1,
"b" : 1
},
"indexName" : "c_1_b_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"c" : [
"[100.0, 200.0]"
],
"b" : [
"[\"food\", \"food\"]"
]
},
"keysExamined" : 3,
"seeks" : 2,
"dupsTested" : 0,
"dupsDropped" : 0,
"seenInvalidated" : 0
}
}
},
"serverInfo" : {
"host" : "iZbp1g11g0cdeeq9ht9fhjZ",
"port" : 27017,
"version" : "3.4.12",
"gitVersion" : "bfde702b19c1baad532ed183a871c12630c1bbba"
},
"ok" : 1
}

 正常查询

> db.test.find({ c: { $gte: 100, $lte: 200 },b:"food"}).explain("executionStats")
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "test.test",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"b" : {
"$eq" : "food"
}
},
{
"c" : {
"$lte" : 200
}
},
{
"c" : {
"$gte" : 100
}
}
]
},
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"b" : 1,
"c" : 1
},
"indexName" : "b_1_c_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"b" : [
"[\"food\", \"food\"]"
],
"c" : [
"[100.0, 200.0]"
]
}
}
},
"rejectedPlans" : [
{
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"c" : 1,
"b" : 1
},
"indexName" : "c_1_b_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"c" : [
"[100.0, 200.0]"
],
"b" : [
"[\"food\", \"food\"]"
]
}
}
},
{
"stage" : "FETCH",
"filter" : {
"b" : {
"$eq" : "food"
}
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"c" : 1
},
"indexName" : "c_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"c" : [
"[100.0, 200.0]"
]
}
}
}
]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 1,
"executionTimeMillis" : 0,
"totalKeysExamined" : 1,
"totalDocsExamined" : 1,
"executionStages" : {
"stage" : "FETCH",
"nReturned" : 1,
"executionTimeMillisEstimate" : 0,
"works" : 3,
"advanced" : 1,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"docsExamined" : 1,
"alreadyHasObj" : 0,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : 1,
"executionTimeMillisEstimate" : 0,
"works" : 2,
"advanced" : 1,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"keyPattern" : {
"b" : 1,
"c" : 1
},
"indexName" : "b_1_c_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"b" : [
"[\"food\", \"food\"]"
],
"c" : [
"[100.0, 200.0]"
]
},
"keysExamined" : 1,
"seeks" : 1,
"dupsTested" : 0,
"dupsDropped" : 0,
"seenInvalidated" : 0
}
}
},
"serverInfo" : {
"host" : "iZbp1g11g0cdeeq9ht9fhjZ",
"port" : 27017,
"version" : "3.4.12",
"gitVersion" : "bfde702b19c1baad532ed183a871c12630c1bbba"
},
"ok" : 1
}

主要对比的还是:

"totalKeysExamined" : 3,
"totalDocsExamined" : 1,

"totalKeysExamined" : 1,
"totalDocsExamined" : 1,

是不是比较有意思,有时候monogdb并不会,走你想要的索引,当你创建多个联合索引的时候,情况就比较明显了

MongoDB中的explain和hint提的使用的更多相关文章

  1. mongodb之使用explain和hint性能分析和优化

    当你第一眼看到explain和hint的时候,第一个反应就是mysql中所谓的这两个关键词,确实可以看出,这个就是在mysql中借鉴过来的,既然是借鉴 过来的,我想大家都知道这两个关键字的用处,话不多 ...

  2. MongoDB的学习--explain()和hint()

    Explain 从之前的文章中,我们可以知道explain()能够提供大量与查询相关的信息.对于速度比较慢的查询来说,这是最重要的诊断工具之一.通过查看一个查询的explain()输出信息,可以知道查 ...

  3. 在MongoDB中执行查询、创建索引

    1. MongoDB中数据查询的方法 (1)find函数的使用: (2)条件操作符: (3)distinct找出给定键所有不同的值: (4)group分组: (5)游标: (6)存储过程. 文档查找 ...

  4. mongodb中的排序和索引快速学习

    在mongodb中,排序和索引其实都是十分容易的,先来小结下排序: 1 先插入些数据    db.SortTest.insert( { name : "Denis", age : ...

  5. MongoDB中聚合工具Aggregate等的介绍与使用

    Aggregate是MongoDB提供的众多工具中的比较重要的一个,类似于SQL语句中的GROUP BY.聚合工具可以让开发人员直接使用MongoDB原生的命令操作数据库中的数据,并且按照要求进行聚合 ...

  6. MongoDB中的聚合操作

    根据MongoDB的文档描述,在MongoDB的聚合操作中,有以下五个聚合命令. 其中,count.distinct和group会提供很基本的功能,至于其他的高级聚合功能(sum.average.ma ...

  7. MongoDB 大数据技术之mongodb中在嵌套子文档的文档上面建立索引

    一.给collection objectid赋自定义的值 MongoDB Enterprise > db.testid.insert({_id:{imsi:"4567890123&qu ...

  8. MongoDB 索引 和 explain 的使用

    索引基本使用 索引是对数据库表中一列或多列的值进行排序的一种结构,可以让我们查询数据库变得 更快.MongoDB 的索引几乎与传统的关系型数据库一模一样,这其中也包括一些基本的查 询优化技巧. 首先我 ...

  9. mongodb中帮助信息和命令

    在Mongodb中,可以看作是一种面向对象的操作,如果你对与某一个操作不清楚,可以直接help. 在mongodb中,无非是对DB.user.collections.文档的操作. 下面是简单的示例: ...

随机推荐

  1. 机械键盘那些事[我用过的minila Filco cherry 3494 阿米洛87]

    用过几月下来.最终现在还能流畅使用的,就剩下3494 跟 minila了. 想起购买的初衷.cherry是泰斗,红轴轻柔,所以三把全红轴. 之后,觉得携带外出不方便,所以就又入了个MINILA. 再后 ...

  2. source insight技巧

    (1)在Source Insight中能不能设置永久Bookmark 可以从macro方面入手 (2)source insight中添加.S文件 (3)source insight里面怎么能不让它每次 ...

  3. C++描述基础算法之直接插入排序

    由于此博文并不难,所以并不需要搬出C++特性的这些大山,所以就使用简单的C++代码描述了.^_^ 直接插入排序是一种简单的插入排序法,所以适用于少量数据的排序,直接插入排序是比较稳定的一种排序算法. ...

  4. Android线程计时器实现

    cocos2dx的计时器很好用,但当app进入后台,其计时器会pause掉,如果想要一个稳恒计时器就得自己去实现完成了,在Cocos2d-x for ios中我们可以利用NSTimer类并结合objc ...

  5. cv:显示Linux命令运行进度

    cv: 显示 cp.mv 等命令的进度 2014-07-14 By toy Posted in Apps Edit on GitHub 在 Linux 系统中 , 大多数命令从来都是信奉 “ 沉默是金 ...

  6. android sql Cursor

    Cursor 是每行的集合. 使用 moveToFirst() 定位第一行. 你必须知道每一列的名称.你必须知道每一列的数据类型.Cursor 是一个随机的数据源. 所有的数据都是通过下标取得. Cu ...

  7. mysql数据与Hadoop之间导入导出之Sqoop实例

    前面介绍了sqoop1.4.6的 如何将mysql数据导入Hadoop之Sqoop安装,下面就介绍两者间的数据互通的简单使用命令. 显示mysql数据库的信息,一般sqoop安装测试用 sqoop l ...

  8. JavaScript实现弹窗报错

    JavaScript实现弹窗报错 1.具体错误如下 SCRIPT 5022:cannot call methods on dialog prior to initialization; attempt ...

  9. laravel之构造器操作数据库

    使用构造器来查询的优点是可以方式sql注入 1.插入 2.修改数据库 3.删除 4.查询

  10. spring cloud心跳检测自我保护(EMERGENCY! EUREKA MAY BE INCORRECTLY CLAIMING INSTANCES ARE UP WHEN THEY'RE NOT. RENEWALS ARE LESSER THAN THRESHOLD AND HENCE THE INSTANCES ARE NOT BEING EXPIRED JUST TO BE SAFE.)

    EMERGENCY! EUREKA MAY BE INCORRECTLY CLAIMING INSTANCES ARE UP WHEN THEY'RE NOT. RENEWALS ARE LESSER ...