SQLite FTS data structures explained: segment leaf nodes


The comment in the source file reads:

**** Segment leaf nodes ****
** Segment leaf nodes store terms and doclists, ordered by term. Leaf
** nodes are written using LeafWriter, and read using LeafReader (to
** iterate through a single leaf node's data) and LeavesReader (to
** iterate through a segment's entire leaf layer). Leaf nodes have
** the format:
**
** varint iHeight;             (height from leaf level, always 0)
** varint nTerm;               (length of first term)
** char pTerm[nTerm];          (content of first term)
** varint nDocList;            (length of term's associated doclist)
** char pDocList[nDocList];    (content of doclist)
** array {
**                             (further terms are delta-encoded)
**   varint nPrefix;           (length of prefix shared with previous term)
**   varint nSuffix;           (length of unshared suffix)
**   char pTermSuffix[nSuffix];(unshared suffix of next term)
**   varint nDocList;          (length of term's associated doclist)
**   char pDocList[nDocList];  (content of doclist)
** }
**

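A node holds a sequence of terms and their corresponding doclists (see the previous article for the details of the doclist structure). In essence the layout is term1 + doclist1 + term2 + doclist2 + term3 + doclist3, and so on.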

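The node begins, at its first byte, with a varint (variable-length integer) giving the height of this node in the b-tree. Height is counted from the bottom of the tree, so the lowest layer, the leaf layer, is level 0. Because this node is a leaf node, its height is always 0.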
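To make the varint idea concrete, here is a minimal decoding sketch. It is not SQLite's own code: the helper name get_varint32 is hypothetical, and the sketch assumes the FTS varint layout of 7 data bits per byte, least-significant group first, with the high bit set on every byte except the last.

```c
#include <stdint.h>

/* Hypothetical helper (not SQLite's implementation): decode one varint
** starting at p, store the value in *out, and return the number of
** bytes consumed. Assumed encoding: 7 data bits per byte, low-order
** group first, high bit set on every byte except the last one. */
static int get_varint32(const unsigned char *p, uint32_t *out){
  uint32_t v = 0;
  int shift = 0;
  int n = 0;
  do{
    v |= (uint32_t)(p[n] & 0x7f) << shift;   /* add the next 7 bits */
    shift += 7;
  }while( (p[n++] & 0x80) && shift<35 );     /* stop after at most 5 bytes */
  *out = v;
  return n;
}
```

Under these assumptions a leaf's iHeight is the single byte 0x00, and a value such as 300 occupies two bytes, 0xAC 0x02.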

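Next comes another varint giving the length of the first term, followed by a char array holding the bytes of the term itself. (Storing a term is just storing a string. A common way to do that is to write the characters in order and finish with a 0 byte as a terminator. That is not what is done here: the length of the string is stored first, and the characters follow.)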

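After that comes another varint giving the length in bytes of the doclist, followed by exactly that many bytes of doclist content (see the previous article for how a doclist is parsed).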
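The fields described so far (iHeight, the first term, and its doclist) can be read with a few lines of C. The following is only a sketch under stated assumptions: FirstEntry, read_first_entry, and get_varint32 are hypothetical names introduced for illustration, the varint layout is assumed to be 7 bits per byte with the low group first, and there is no bounds checking, so the input must be a well-formed node.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical varint decoder: 7 data bits per byte, low group first,
** high bit set on every byte except the last. Returns bytes consumed. */
static int get_varint32(const unsigned char *p, uint32_t *out){
  uint32_t v = 0;
  int shift = 0;
  int n = 0;
  do{
    v |= (uint32_t)(p[n] & 0x7f) << shift;
    shift += 7;
  }while( (p[n++] & 0x80) && shift<35 );
  *out = v;
  return n;
}

/* Illustrative view of the first entry of a leaf node. The pointers
** refer into the node buffer; nothing is copied or NUL-terminated. */
typedef struct FirstEntry {
  uint32_t iHeight;               /* 0 for a leaf node */
  uint32_t nTerm;                 /* length of the first term */
  const unsigned char *pTerm;     /* nTerm bytes of term content */
  uint32_t nDocList;              /* length of the doclist */
  const unsigned char *pDocList;  /* nDocList bytes of doclist content */
} FirstEntry;

/* Parse iHeight, the first term, and its doclist. Returns the number
** of bytes consumed. Assumes a well-formed node (no bounds checks). */
static size_t read_first_entry(const unsigned char *p, FirstEntry *e){
  const unsigned char *p0 = p;
  p += get_varint32(p, &e->iHeight);
  p += get_varint32(p, &e->nTerm);
  e->pTerm = p;                   /* length-prefixed, so no 0 terminator */
  p += e->nTerm;
  p += get_varint32(p, &e->nDocList);
  e->pDocList = p;
  p += e->nDocList;
  return (size_t)(p - p0);
}
```

Because each length is stored ahead of its data, the reader can skip over a term or a doclist without scanning it for a terminator.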

//-----

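What follows is the next term and its doclist. Terms are sorted as strings, so adjacent terms frequently begin with the same characters. When the current term is stored, a value is written first giving how many leading characters it shares with the previous term, then a value giving how many characters remain once that prefix is removed (the suffix), and then the suffix string itself. Taking the first nPrefix characters of the previous term and appending the current term's suffix yields the full current term.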

The rest is the same as before: a varint giving the length of the doclist, followed by the doclist content.

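The encoding used from the second term onward has two benefits. First, it saves a good deal of space, because in a sorted list a large proportion of terms share a prefix with their predecessor. Second, judging from the code, it costs nothing in performance: the decoding logic is straightforward, with no back-and-forth branching, and terms are read sequentially in a single pass.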
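A small worked example shows where the space saving comes from. The sketch below is written from the writer's point of view and is not taken from SQLite: shared_prefix and delta_encoded_size are hypothetical helpers, doclists are left out of the count, and every length is assumed to fit in a one-byte varint (below 128).

```c
#include <stddef.h>
#include <string.h>

/* Length of the prefix shared by a[0..na) and b[0..nb). */
static size_t shared_prefix(const char *a, size_t na,
                            const char *b, size_t nb){
  size_t n = 0;
  while( n<na && n<nb && a[n]==b[n] ) n++;
  return n;
}

/* Bytes needed to store the terms of a sorted list in the leaf layout,
** counting only the length fields and term bytes (doclists excluded).
** Assumes each length fits in a one-byte varint. */
static size_t delta_encoded_size(const char **azTerm, size_t nTermCount){
  size_t total = 0;
  size_t i;
  for(i=0; i<nTermCount; i++){
    size_t n = strlen(azTerm[i]);
    if( i==0 ){
      total += 1 + n;                 /* nTerm + pTerm */
    }else{
      size_t nPrev = strlen(azTerm[i-1]);
      size_t nPrefix = shared_prefix(azTerm[i-1], nPrev, azTerm[i], n);
      total += 2 + (n - nPrefix);     /* nPrefix + nSuffix + pTermSuffix */
    }
  }
  return total;
}
```

For the sorted terms "apple", "application", "apply", the second term is stored as nPrefix=4, nSuffix=7, "ication", and the third as nPrefix=4, nSuffix=1, "y". The total is 18 bytes, compared with 24 bytes if each term were written in full with a one-byte length in front.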

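Looking at the code:

pReader->nTerm is the length of the previous term.

pReader->zTerm[] is the array holding the content of the previous term.

pNext already points to the first byte of the current term's entry.

To read the current term: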

pNext += sqlite3Fts3GetVarint32(pNext, &nPrefix); // read the length of the shared prefix first
pNext += sqlite3Fts3GetVarint32(pNext, &nSuffix); // then read the length of the suffix

memcpy(&pReader->zTerm[nPrefix], pNext, nSuffix); // zTerm still holds the previous term; copy the current term's suffix in, starting at index nPrefix
pReader->nTerm = nPrefix+nSuffix; // set the length of the current term

Done!
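Putting the pieces together, the following sketch walks every term in one leaf node. It is an illustration under assumptions, not SQLite's reader: LeafCursor, leaf_open, leaf_next, get_varint32, and the fixed MAX_TERM buffer are all hypothetical, and the input is assumed to be a well-formed node. One design choice is worth noting: because iHeight is always 0 in a leaf, the sketch reads it as an nPrefix of 0 and reads nTerm as nSuffix, so the first term goes through the same path as every later term.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_TERM 256   /* illustrative fixed cap on term length */

/* Hypothetical varint decoder: 7 data bits per byte, low group first,
** high bit set on every byte except the last. Returns bytes consumed. */
static int get_varint32(const unsigned char *p, uint32_t *out){
  uint32_t v = 0;
  int shift = 0;
  int n = 0;
  do{
    v |= (uint32_t)(p[n] & 0x7f) << shift;
    shift += 7;
  }while( (p[n++] & 0x80) && shift<35 );
  *out = v;
  return n;
}

typedef struct LeafCursor {
  const unsigned char *pNext;     /* next unread byte of the node */
  const unsigned char *pEnd;      /* one past the last byte of the node */
  char zTerm[MAX_TERM];           /* current term, not NUL-terminated */
  uint32_t nTerm;                 /* length of the current term */
  const unsigned char *pDocList;  /* doclist of the current term */
  uint32_t nDocList;              /* length of that doclist */
} LeafCursor;

/* Position the cursor before the first term of the node p[0..n). */
static void leaf_open(LeafCursor *c, const unsigned char *p, size_t n){
  c->pNext = p;
  c->pEnd = p + n;
  c->nTerm = 0;
  c->pDocList = 0;
  c->nDocList = 0;
}

/* Advance to the next term. Returns 1 on success, 0 at the end of the
** node or if the prefix/suffix lengths are inconsistent. */
static int leaf_next(LeafCursor *c){
  uint32_t nPrefix, nSuffix;
  if( c->pNext>=c->pEnd ) return 0;
  c->pNext += get_varint32(c->pNext, &nPrefix);  /* iHeight (0) on the first call */
  c->pNext += get_varint32(c->pNext, &nSuffix);  /* nTerm on the first call */
  if( nPrefix>c->nTerm || nSuffix>MAX_TERM-nPrefix ) return 0;
  memcpy(&c->zTerm[nPrefix], c->pNext, nSuffix); /* keep prefix, overwrite the rest */
  c->nTerm = nPrefix + nSuffix;
  c->pNext += nSuffix;
  c->pNext += get_varint32(c->pNext, &c->nDocList);
  c->pDocList = c->pNext;
  c->pNext += c->nDocList;                       /* skip to the next entry */
  return 1;
}
```

A caller would invoke leaf_open once and then call leaf_next in a loop until it returns 0, reading zTerm and nTerm after each successful call.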


Original article: https://54852.com/sjk/1173977.html