This post excerpts the Wikipedia article on the k-d tree. I find its explanation vivid and detailed, and an unusually good read.


Overview

A \(k\)-d tree (short for \(k\)-dimensional tree) is a binary space-partitioning tree for organizing points in a \(k\)-dimensional space. \(k\)-d trees are a useful data structure for searches involving a multidimensional search key.

Construction

The canonical method of \(k\)-d tree construction has the following constraints:

  • As one moves down the tree, one cycles through the axes used to select the splitting planes.
  • Points are inserted by selecting the median of the points being put into the subtree, with respect to their coordinates in the axis being used to create the splitting plane.

This method leads to a balanced \(k\)-d tree, in which each leaf node is approximately the same distance from the root. However, balanced trees are not necessarily optimal for all applications.
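The two constraints above can be sketched in a few lines of C++. This is only a minimal illustration, assuming 2-D integer points; the names `Point`, `build`, and `height` are made up for this sketch and do not come from the article. The tree is stored implicitly: the subtree of a range `[l, r)` is rooted at the middle element of that range.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

using Point = std::array<int, 2>;   // hypothetical 2-D point type for this sketch

// Rearranges pts[l, r) so that the median along the current axis sits at the
// middle of the range, then recurses on both halves with the next axis.
void build(std::vector<Point>& pts, int l, int r, int depth) {
    assert(0 <= l && l <= r && r <= (int)pts.size());
    if (r - l <= 1) return;                  // empty range or a single leaf
    int axis = depth % 2;                    // cycle through the axes
    int mid = l + (r - l) / 2;               // position of the median
    std::nth_element(pts.begin() + l, pts.begin() + mid, pts.begin() + r,
                     [axis](const Point& a, const Point& b) {
                         return a[axis] < b[axis];
                     });
    build(pts, l, mid, depth + 1);           // points not greater than the median
    build(pts, mid + 1, r, depth + 1);       // points not less than the median
}

// Number of levels of the implicit tree over n points.
int height(int n) {
    if (n <= 0) return 0;
    int mid = n / 2;
    return 1 + std::max(height(mid), height(n - mid - 1));
}
```

Because every split uses the median, the two halves differ in size by at most one, which is what keeps the leaves at roughly the same depth.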

Nearest Neighbor Search

Terms:

  • the split dimensions
  • the splitting (hyper)plane
  • "current best"

The nearest neighbor (NN) search algorithm aims to find the point in the tree that is nearest to a given point. This search can be done efficiently by using the tree properties to quickly eliminate large portions of the search space.

Searching for a nearest neighbor in a \(k\)-d tree proceeds as follows:

  1. Starting with the root node, the algorithm moves down the tree recursively, going left or right depending on whether the search point is less than or greater than the current node in the split dimension.
  2. Once the algorithm reaches a leaf node, it saves that node point as the "current best".
  3. The algorithm unwinds the recursion of the tree, performing the following steps at each node:
    1. If the current node is closer than the current best, then it becomes the current best.
    2. The algorithm checks whether there could be any points on the other side of the splitting plane that are closer to the search point than the current best. In concept, this is done by intersecting the splitting hyperplane with a hypersphere around the search point that has a radius equal to the current nearest distance. Since the hyperplanes are all axis-aligned, this is implemented as a simple comparison to see whether the distance between the splitting coordinate of the search point and the current node is less than the distance (over all coordinates) from the search point to the current best.
      1. If the hypersphere crosses the plane, there could be nearer points on the other side of the plane, so the algorithm must move down the other branch of the tree from the current node looking for closer points, following the same recursive process as the entire search.
      2. If the hypersphere doesn't intersect the splitting plane, then the algorithm continues walking up the tree, and the entire branch on the other side of that node is eliminated.
  4. When the algorithm finishes this process for the root node, the search is complete.

Generally the algorithm uses squared distances for comparison to avoid computing square roots. Additionally, it can save computation by holding the squared current best distance in a variable for comparison.
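The search steps, including the squared-distance comparison, might look like the sketch below. It is an illustration under assumptions rather than a reference implementation: points are 2-D, the tree uses an implicit layout in which the subtree of a range `[l, r)` is rooted at its middle element, and the names `Point`, `dist2`, `build`, `nearest`, and `nearestD2` are all hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

using Point = std::array<long long, 2>;   // hypothetical 2-D point type

long long dist2(const Point& a, const Point& b) {
    long long dx = a[0] - b[0], dy = a[1] - b[1];
    return dx * dx + dy * dy;             // squared distance, no square root
}

// Median split with cycling axes; the node of range [l, r) is its middle element.
void build(std::vector<Point>& pts, int l, int r, int depth) {
    if (r - l <= 1) return;
    int axis = depth % 2, mid = l + (r - l) / 2;
    std::nth_element(pts.begin() + l, pts.begin() + mid, pts.begin() + r,
                     [axis](const Point& a, const Point& b) {
                         return a[axis] < b[axis];
                     });
    build(pts, l, mid, depth + 1);
    build(pts, mid + 1, r, depth + 1);
}

// Updates best (an index into pts) and bestD2 with the nearest point in pts[l, r).
void nearest(const std::vector<Point>& pts, int l, int r, int depth,
             const Point& q, int& best, long long& bestD2) {
    if (l >= r) return;
    int axis = depth % 2, mid = l + (r - l) / 2;
    long long diff = q[axis] - pts[mid][axis];
    // Go down the side of the splitting plane that contains q first.
    if (diff < 0) nearest(pts, l, mid, depth + 1, q, best, bestD2);
    else          nearest(pts, mid + 1, r, depth + 1, q, best, bestD2);
    // While unwinding, test the current node against the current best.
    long long d2 = dist2(q, pts[mid]);
    if (best < 0 || d2 < bestD2) { best = mid; bestD2 = d2; }
    // Visit the far side only if the hypersphere crosses the splitting plane.
    if (diff * diff < bestD2) {
        if (diff < 0) nearest(pts, mid + 1, r, depth + 1, q, best, bestD2);
        else          nearest(pts, l, mid, depth + 1, q, best, bestD2);
    }
}

// Convenience wrapper: squared distance from q to its nearest neighbor.
long long nearestD2(const std::vector<Point>& pts, const Point& q) {
    assert(!pts.empty());
    int best = -1;
    long long bestD2 = 0;
    nearest(pts, 0, (int)pts.size(), 0, q, best, bestD2);
    return bestD2;
}
```

Note that `bestD2` is the only distance kept between calls, so each pruning test costs one multiplication and one comparison.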

The algorithm can be extended in several ways by simple modifications. It can provide the \(k\) nearest neighbors to a point by maintaining \(k\) current bests instead of just one. A branch is only eliminated when \(k\) points have been found and the branch cannot have points closer than any of the \(k\) current bests.


Implementation

\(k\) nearest neighbors (\(k\)NN)

#include <bits/stdc++.h>
#define lson id<<1
#define rson id<<1|1
#define sqr(x) ((x)*(x))
using namespace std;
using LL=long long;
const int N=5e4+5;

// K-D tree: a special case of binary space partitioning trees
int DIM, idx;
struct Node{
    int key[5];
    bool operator<(const Node &rhs)const{
        return key[idx]<rhs.key[idx];
    }
    void read(){
        for(int i=0; i<DIM; i++)
            scanf("%d", key+i);
    }
    LL dis2(const Node &rhs)const{
        LL res=0;
        for(int i=0; i<DIM; i++){
            LL d=(LL)key[i]-rhs.key[i];   // widen before squaring to avoid int overflow
            res+=d*d;
        }
        return res;
    }
    void out(){
        for(int i=0; i<DIM; i++)
            printf("%d%c", key[i], i==DIM-1?'\n':' ');
    }
}p[N];

Node a[N<<2];   // K-D tree
bool f[N<<2];

// [l, r)
void build(int id, int l, int r, int dep){
    if(l==r) return;    // empty range: no node is created, f[id] stays false
    f[id]=true, f[lson]=f[rson]=false;
    // select axis based on depth so that axis cycles through all valid values
    idx=dep%DIM;
    int mid=(l+r)>>1;
    // partition around the median along axis idx (nth_element does not fully sort)
    nth_element(p+l, p+mid, p+r);
    a[id]=p[mid];
    build(lson, l, mid, dep+1);
    build(rson, mid+1, r, dep+1);
}

using P=pair<LL,Node>;
priority_queue<P> que;
// multidimensional search key

void query(const Node &p, int id, int m, int dep){
    int dim=dep%DIM;
    int x=lson, y=rson;
    // left: <, right >=
    if(p.key[dim]>=a[id].key[dim])
        swap(x, y);

    if(f[x]) query(p, x, m, dep+1);

    P cur{p.dis2(a[id]), a[id]};

    if((int)que.size()<m){
        que.push(cur);
    }
    else if(cur.first<que.top().first){
        que.pop();
        que.push(cur);
    }
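    // search the far side while fewer than m points are held, or when the
    // splitting plane is closer than the current m-th best
    LL d=(LL)a[id].key[dim]-p.key[dim];
    if(f[y] && ((int)que.size()<m || d*d<que.top().first))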
        query(p, y, m, dep+1);
}

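Notes:

  1. The bool array f[] records whether a given node of the complete binary tree exists. Instead of the complete-binary-tree layout, one can store the children in two arrays, lson[] and rson[]; this also saves space, since the arrays then only need to be twice the number of nodes in size.
  2. Ranges are half-open: closed on the left, open on the right.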
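The lson[]/rson[] alternative from note 1 could be sketched as follows. This is one possible layout under assumptions, not the code used above: the struct `KdTree` and its members are hypothetical, points are 2-D, each node reuses the array slot of its own point, and -1 marks a missing child in place of the f[] flags.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <utility>
#include <vector>

using Point = std::array<int, 2>;    // hypothetical 2-D point type

struct KdTree {
    std::vector<Point> node;         // node[i] is the point stored at node i
    std::vector<int> lson, rson;     // child indices, -1 when the child is absent
    int root = -1;

    explicit KdTree(std::vector<Point> pts)
        : node(std::move(pts)), lson(node.size(), -1), rson(node.size(), -1) {
        root = build(0, (int)node.size(), 0);
    }

    // Builds the subtree over node[l, r) and returns the index of its root.
    int build(int l, int r, int depth) {
        assert(l <= r);
        if (l == r) return -1;       // empty half-open range: no node
        int axis = depth % 2;
        int mid = l + (r - l) / 2;
        std::nth_element(node.begin() + l, node.begin() + mid, node.begin() + r,
                         [axis](const Point& a, const Point& b) {
                             return a[axis] < b[axis];
                         });
        lson[mid] = build(l, mid, depth + 1);
        rson[mid] = build(mid + 1, r, depth + 1);
        return mid;
    }
};
```

With this layout the two child arrays hold one entry per point each, and no slot is reserved for nodes that do not exist.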
