Update namespace.cc: re-implement the member function NormalizePath#778
Open
ToWorld wants to merge 5 commits intobaidu:masterfrom
Open
Update namespace.cc: re-implement the member function NormalizePath#778ToWorld wants to merge 5 commits intobaidu:masterfrom
ToWorld wants to merge 5 commits intobaidu:masterfrom
Conversation
Update namespace.cc: re-implement the member function NormalizePath, which increased efficiency by at least 10%
yvxiang
reviewed
Jan 9, 2017
src/nameserver/namespace.cc
Outdated
| slash = true; | ||
| } else { | ||
| slash = false; | ||
| if (path[i-1] == '/') { |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Update namespace.cc: re-implement the member function NormalizePath. There are some tests below.

算法说明:
`算法伪代码:
std::string NameSpace::NormalizePath(const std::string& path) {
// 存放标准化后的路径
std::string ret;
// 处理边界条件
if (path.empty() || path[0] != '/') {
ret = "/";
}
// 算法的主循环
// 每次连续处理两个字符
// 为什么选择连续处理两个字符,而不是一个,或者三个或者更多
// 这个主要和函数所要处理的问题相关,本函数主要是想将(用户)
// 输入的路径进行标准化,如:
// "home" --> "/home"
// "" --> "/"
// "///" --> "/"
// "home//baidu///" --> "/home/baidu"
// etc.
// 那么本算法的主要是想就是,每次处理连续两个字符,
// 并且当前索引index指向连续两个字符的后一个字符
// 那么此时index的值初始化为1
for (uint32_t index = 1; i < path.size(); ) {
// 若当前索引对应的字符不是'/', 那么我们可以完全将当前
// 连续的两个字符存储起来
if (path[index] != '/') {
ret.push_back(path[index-1]);
ret.push_back(path[index]);
// 此时index需要跳2段
index += 2;
} else {
// 当前索引对应的字符是'/',那么我们就需要知道,当前
// 字符的左边和右边各是什么字符,首先我们处理当前字符的
// 左边的字符,若不是'/', 存储,否则忽略,
// 而当前字符的后一个字符是什么,当前字符不能知道,则当前
// 字符先不处理,此时跳1段
if (path[index-1] != '/') {
ret.push_back(path[index-1]);
}
index++;
}
}
// 很容易知道,上面的主循环由于索引会跳2段,或者索引初始化为1,会导致源字符串中的
// 最后一个字符不会被处理,比如:
// "a", "/", "/aa", "///"
// 最后我们处理这种情况
// 这个时候只需要处理一个可能,判断最后一个字符是否为'/',不是则存储,是的话,则忽略。
// 但是这里会遗漏一个test,即"/",我们在后面还会加一个来弥补
if (index == len && path[index-1]!='/') {
ret.push_back(path[index-1]);
}
if (ret.empty()) {
ret = "/";
}
return ret;
}
`
下面说说为什么这个算法比每次处理一个字符效率要高,
若遇到后面一种极端的情况,那么本算法可能就和每次处理一个字符效率相当了。所以我们还是考虑正常的情况,这个时候
若非"/"字符所占的比例越高,每次处理两个字符的效率就越高,不过永远不会超越每次处理一个字符的50%,因为本算法每次
都是处理两个字符。
经过初步测试,效率的确提高不少。
欢迎大家提出更好的方法。