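Pipal Password Analyser: A Close Look at Leaked Password Data

Pipal is a password analysis tool built for working through large dumps of leaked passwords. It reports password length distribution, character set usage, common base words, date patterns and other key figures, and it supports Hashcat mask generation, helping security researchers understand how people choose passwords.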

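Pipal, Password Analyser

On most internal penetration tests I generally manage to get a password dump from the domain controller. To do some basic analysis on that data I wrote Counter, and since I first released it I have modified it many times to generate extra statistics that are useful when reporting to management.

Recently a good friend, n00bz, asked on Twitter whether anyone had a tool that could analyse passwords. I pointed him to Counter and told him to let me know if he had any suggestions for additions. He did, and over the last month we have developed a lot of new features together which we think will help anyone who has a large set of cracked passwords to analyse. We also got valuable input from well-known password analysts Matt Weir and Martin Bos, and I would like to give them both a big thanks.

I have to point out that all this tool does is give you the statistics and the data to help you analyse the passwords. The real work is in how you interpret the results: I give you the numbers, you tell the story.

As there have been so many changes to the underlying code, I also decided to change the name (the reason is explained elsewhere) and do a completely new release.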

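Modular Release

Over the last few months I have been rewriting Pipal to make it modular rather than one big monolithic block. Rather than try to add the extra information here, I have written a short blog post to introduce it.

Pipal Goes Modular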

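Version 2

Version 2 brings two major changes. The first is a big increase in speed. The patch was submitted by Stefan Venken, who said a simple mention would be enough, but I want to give him a special thanks. Processing the LinkedIn list took many hours on version 1, while version 2 processed 3.5 million records in about 15 minutes. Thank you.

The second change is the addition of US area code and zip code lookups. This small feature gives some interesting geographical data when it is run against password lists that originate in the US. The best example I have seen is the dump from the Military Singles site, where some passwords were clearly grouped around US military bases. The British do not have the same relationship with phone numbers, so I know this will not work here, but if anyone can suggest other areas where it might be useful I will look at building some kind of location awareness. You could then specify the source of a list and get results tailored to the right region, or just run every region and see whether a pattern emerges.

One change in version 2 that is not in the code base is the move from hosting the code myself to github. This is my first github hosted project, so I may get things wrong, and if I do, sorry. A lot of people have asked how to submit patches, so this seemed the best way to do it, and hopefully it will work out. See the download section for more information.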

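Worked Example

So, what does Pipal do? The easiest way to explain is to show the output generated from parsing a leaked password list. I chose the phpBB leaked password list taken from the SkullSecurity site.

The first output is the number of entries parsed from the file and the number of unique entries found. Unfortunately, the list I chose had already been uniqued, so in this example the two numbers match.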

Total entries = 184373
Total unique entries = 184373

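Next are the top 10 passwords. In this case the list I chose had already been filtered to remove duplicates, which is why every word appears only once. The cap of 10 can be configured with a command line parameter, and I suggest experimenting with the limit because sometimes the next entry down is the one that starts to explain things.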

Top 10 passwords
123456 = 1 (0.0%)
password = 1 (0.0%)
phpbb = 1 (0.0%)
qwerty = 1 (0.0%)
12345 = 1 (0.0%)
12345678 = 1 (0.0%)
letmein = 1 (0.0%)
111111 = 1 (0.0%)
1234 = 1 (0.0%)
123456789 = 1 (0.0%)
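
The entry counts and the top 10 table above amount to a frequency count over the list. Here is a minimal Python sketch of that idea, assuming the passwords have already been read into a list. The name `summarise` and the sample data are illustrative and are not part of Pipal.

```python
from collections import Counter


def summarise(passwords, top=10):
    """Return total count, unique count, and the `top` most common passwords."""
    counts = Counter(passwords)
    total = len(passwords)
    # Each row is (password, occurrences, percentage of all entries).
    rows = [(pw, n, round(100 * n / total, 2)) for pw, n in counts.most_common(top)]
    return total, len(counts), rows


# Hypothetical sample data, not taken from the phpBB list.
sample = ["123456", "password", "123456", "qwerty", "123456", "password"]
total, unique, rows = summarise(sample, top=2)
print(f"Total entries = {total}")
print(f"Total unique entries = {unique}")
for pw, n, pct in rows:
    print(f"{pw} = {n} ({pct}%)")
```

On a list that has already been uniqued, every count would be 1, as in the output above.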

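The next list is the count of base words. I define a base word as a word with any non-alpha characters stripped from the start and end. This is useful for spotting the common words that passwords are built on, such as company names or places. I did consider stripping out all non-alpha characters, but in one of the lists I tested I found the base word "un1c0rn". Keeping the non-alpha characters inside the word makes sense; removing them would give "uncrn", which does not really mean anything.

Unsurprisingly, as this list comes from phpBB, the top word that passwords are based on is "phpbb". The next is another obvious base word, "password", but "dragon" is one I did not expect.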

Top 10 base words
phpbb = 332 (0.18%)
password = 89 (0.05%)
dragon = 76 (0.04%)
pass = 70 (0.04%)
mike = 69 (0.04%)
blue = 67 (0.04%)
test = 66 (0.04%)
qwerty = 59 (0.03%)
alex = 58 (0.03%)
alpha = 53 (0.03%)
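
The base word definition given earlier (strip non-alpha characters from the start and end only) can be sketched in a few lines of Python. This is an illustration of the definition, not Pipal's own code; the function names are hypothetical, and lower-casing and skipping empty results are assumptions of this sketch.

```python
import re
from collections import Counter

# Strip runs of non-letters from the start and end only; inner characters stay.
_EDGES = re.compile(r"^[^a-zA-Z]+|[^a-zA-Z]+$")


def base_word(password):
    """Return the password with leading and trailing non-letters removed."""
    return _EDGES.sub("", password).lower()


def top_base_words(passwords, top=10):
    words = (base_word(p) for p in passwords)
    # Skip passwords that contain no letters at all, such as "123456".
    return Counter(w for w in words if w).most_common(top)


print(base_word("123un1c0rn!!"))
print(top_base_words(["phpbb1", "PHPBB", "dragon99", "123456"]))
```

Because only the edges are trimmed, "un1c0rn" survives intact instead of collapsing to "uncrn".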

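Next come the lengths, which are fairly self-explanatory. It is a shame that the people who made the effort and had passwords of more than 20 characters still had them leaked.

I hope the 948 words of three characters or fewer are mistakes made while cracking the list.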

Password length (length ordered)
1 = 33 (0.02%)
2 = 138 (0.07%)
3 = 777 (0.42%)
4 = 4597 (2.49%)
5 = 8199 (4.45%)
6 = 42069 (22.82%)
7 = 32731 (17.75%)
8 = 55338 (30.01%)
9 = 19187 (10.41%)
10 = 11897 (6.45%)
11 = 4934 (2.68%)
12 = 2506 (1.36%)
13 = 1019 (0.55%)
14 = 516 (0.28%)
15 = 233 (0.13%)
16 = 126 (0.07%)
17 = 37 (0.02%)
18 = 28 (0.02%)
19 = 10 (0.01%)
20 = 9 (0.0%)
21 = 6 (0.0%)
22 = 3 (0.0%)
23 = 4 (0.0%)
25 = 2 (0.0%)
27 = 3 (0.0%)
28 = 2 (0.0%)
32 = 4 (0.0%)

Password length (count ordered)
8 = 55338 (30.01%)
6 = 42069 (22.82%)
7 = 32731 (17.75%)
9 = 19187 (10.41%)
10 = 11897 (6.45%)
5 = 8199 (4.45%)
11 = 4934 (2.68%)
4 = 4597 (2.49%)
12 = 2506 (1.36%)
13 = 1019 (0.55%)
3 = 777 (0.42%)
14 = 516 (0.28%)
15 = 233 (0.13%)
2 = 138 (0.07%)
16 = 126 (0.07%)
17 = 37 (0.02%)
1 = 33 (0.02%)
18 = 28 (0.02%)
19 = 10 (0.01%)
20 = 9 (0.0%)
21 = 6 (0.0%)
23 = 4 (0.0%)
32 = 4 (0.0%)
22 = 3 (0.0%)
27 = 3 (0.0%)
25 = 2 (0.0%)
28 = 2 (0.0%)

Next is a pretty chart of the length data, which I am quite proud to have included.

        |                                                               
        |                                                               
        |                                                               
      | |                                                               
      | |                                                               
      | |                                                               
      |||                                                               
      |||                                                               
      |||                                                               
      |||                                                               
      ||||                                                              
      ||||                                                              
      |||||                                                             
     ||||||                                                             
    ||||||||                                                            
|||||||||||||||||||||||||||||||||                                       
000000000011111111112222222222333
012345678901234567890123456789012
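
The two length tables and the chart above can all be derived from a single tally of lengths. The Python sketch below shows one way to do it. The scaling rule, the chart height and the function names are assumptions made for illustration, so the picture it draws will not match Pipal's chart exactly.

```python
from collections import Counter


def length_counts(passwords):
    """Map each password length to the number of passwords of that length."""
    return Counter(len(p) for p in passwords)


def ascii_chart(counts, height=16):
    """Draw one column per length, from 0 up to the longest length seen."""
    if not counts:
        return ""
    width = max(counts) + 1
    peak = max(counts.values())
    bars = []
    for i in range(width):
        n = counts.get(i, 0)
        # Assumption: any non-zero count gets at least one row so it stays visible.
        bars.append(max(1, round(height * n / peak)) if n else 0)
    rows = []
    for level in range(height, 0, -1):
        row = "".join("|" if bar >= level else " " for bar in bars)
        rows.append(row.rstrip())
    # Two axis rows: tens digit above units digit.
    rows.append("".join(str(i // 10 % 10) for i in range(width)))
    rows.append("".join(str(i % 10) for i in range(width)))
    return "\n".join(rows)


# Hypothetical sample data.
counts = length_counts(["a", "bb", "cc", "dddd"])
by_length = sorted(counts.items())   # length ordered
by_count = counts.most_common()      # count ordered
print(ascii_chart(counts, height=2))
```

Sorting the same tally by key or by value gives the length ordered and count ordered views respectively.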

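Next is some more self-explanatory information. 30% of people chose a password of one to six characters, and 41% chose a password made up only of lowercase letters.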

One to six characters = 55807 (30.27%)
One to eight characters = 143874 (78.03%)
More than eight characters = 40507 (21.97%)

Only lowercase alpha = 76041 (41.24%)
Only uppercase alpha = 1706 (0.93%)
Only alpha = 77747 (42.17%)
Only numeric = 20728 (11.24%)

First capital last symbol = 225 (0.12%)
First capital last number = 4749 (2.58%)
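
The character set categories above lend themselves to one regular expression per label. The Python sketch below is a guess at what each label means, not Pipal's actual rules. In particular, it counts mixed-case passwords under "Only alpha", whereas in the output above that figure equals the sum of the lowercase-only and uppercase-only counts, so Pipal may define it differently.

```python
import re

# Category name -> pattern. The names mirror the labels in the output above,
# but the patterns are this sketch's assumption about what each label means.
CHECKS = {
    "Only lowercase alpha": re.compile(r"[a-z]+"),
    "Only uppercase alpha": re.compile(r"[A-Z]+"),
    "Only alpha": re.compile(r"[a-zA-Z]+"),
    "Only numeric": re.compile(r"[0-9]+"),
    "First capital last symbol": re.compile(r"[A-Z].*[^a-zA-Z0-9]"),
    "First capital last number": re.compile(r"[A-Z].*[0-9]"),
}


def categorise(passwords):
    """Count how many passwords fall into each category."""
    totals = {name: 0 for name in CHECKS}
    for pw in passwords:
        for name, pattern in CHECKS.items():
            # fullmatch requires the whole password to fit the pattern.
            if pattern.fullmatch(pw):
                totals[name] += 1
    return totals


# Hypothetical sample data.
result = categorise(["password", "ABC", "Passw0rd!", "Summer1", "12345"])
for name, n in result.items():
    print(f"{name} = {n}")
```

A password can land in more than one category, which is why the percentages in a report like this need not add up to 100.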

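The external list is a list of words passed to Pipal on the command line. For each word, I count how many passwords contain it. This is similar to the base words, but here you tell the app which words to search for.

If you are wondering why "dragon" is counted only 76 times as a base word but shows up 185 times here, it is because there are 109 base words that contain "dragon" but are not just "dragon", for example "phpdragon".

The external list I used is one that claims to be the "25 worst passwords on the internet". Another suggested word list is the domain names from the Alexa top 1000. That could be useful if you are analysing a password list from an unknown source, or if you want to know whether a list from one domain is linked to other domains.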

External list (Top 10)
master = 229 (0.12%)
123456 = 208 (0.11%)
dragon = 185 (0.1%)
password = 164 (0.09%)
monkey = 118 (0.06%)
shadow = 105 (0.06%)
qwerty = 95 (0.05%)
1234567 = 72 (0.04%)
12345678 = 47 (0.03%)
letmein = 44 (0.02%)
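
The external list check described above is a substring count: for each supplied word, how many passwords contain it anywhere. A minimal Python sketch follows, with case-insensitive matching as an assumption and `external_list_counts` as a hypothetical name.

```python
def external_list_counts(passwords, words):
    """Count, for each word, the passwords that contain it (case-insensitive)."""
    lowered = [p.lower() for p in passwords]
    counts = {w: sum(w.lower() in p for p in lowered) for w in words}
    # Most frequent first, matching the ordering of the output above.
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical sample data.
sample = ["dragon", "phpdragon", "Dragon12", "monkey"]
print(external_list_counts(sample, ["dragon", "monkey", "shadow"]))
```

Here "dragon" scores 3 even though only one sample password reduces to the bare word "dragon", which is the same effect as the 76 versus 185 difference described above.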

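We now look at months and days, in both full and abbreviated form. While "may" could be a person's name or an ordinary word, for some reason it seems to be a popular word in the list. "June" and "April" are also popular, but they are names too, which could explain the higher proportion. For days of the week there is a strong preference for "monday" and "friday"; guess which days people change their passwords on.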

Months
january = 8 (0.0%)
february = 3 (0.0%)
march = 23 (0.01%)
april = 48 (0.03%)
may = 171 (0.09%)
june = 56 (0.03%)
july = 27 (0.01%)
august = 22 (0.01%)
september = 3 (0.0%)
october = 15 (0.01%)
november = 7 (0.0%)
december = 6 (0.0%)

Days
monday = 12 (0.01%)
tuesday = 2 (0.0%)
wednesday = 1 (0.0%)
thursday = 3 (0.0%)
friday = 11 (0.01%)
saturday = 1 (0.0%)
sunday = 5 (0.0%)

Months (Abreviated)
jan = 341 (0.18%)
feb = 42 (0.02%)
mar = 1406 (0.76%)
apr = 108 (0.06%)
may = 171 (0.09%)
jun = 190 (0.1%)
jul = 158 (0.09%)
aug = 83 (0.05%)
sept = 17 (0.01%)
oct = 69 (0.04%)
nov = 161 (0.09%)
dec = 120 (0.07%)

Days (Abreviated)
mon = 953 (0.52%)
tues = 3 (0.0%)
wed = 69 (0.04%)
thurs = 6 (0.0%)
fri = 169 (0.09%)
sat = 187 (0.1%)
sun = 299 (0.16%)
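
The abbreviated counts above are much higher than the full-name counts. One plausible explanation, assuming the check is a plain substring match, is that short strings such as "mar" and "mon" also occur inside unrelated words. The Python sketch below illustrates the effect with made-up data; the term lists copy the labels from the output, and `count_containing` is a hypothetical helper.

```python
MONTHS = ["january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"]
MONTHS_ABBR = ["jan", "feb", "mar", "apr", "may", "jun", "jul",
               "aug", "sept", "oct", "nov", "dec"]


def count_containing(passwords, terms):
    """Count, for each term, the passwords that contain it (case-insensitive)."""
    lowered = [p.lower() for p in passwords]
    return {t: sum(t in p for p in lowered) for t in terms}


# Hypothetical sample: only "march2010" refers to the month, but "mar"
# also matches "mark99" and "mario".
sample = ["march2010", "mark99", "mario", "monkey", "friday13"]
full = count_containing(sample, MONTHS)
abbr = count_containing(sample, MONTHS_ABBR)
print(full["march"], abbr["mar"])
```

This is worth keeping in mind when reading the abbreviated tables: a high count does not necessarily mean people were thinking of a date.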

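Having looked at months and days, why not look at years as well. Years around the millennium seem to be popular in this list. I also ran this against the leaked myspace passwords, which showed years around 1990 as popular; maybe that says something about the age of the typical user.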

Includes years
1975 = 82 (0.04%)
1976 = 80 (0.04%)
1977 = 96 (0.05%)
1978 = 118 (0.06%)
1979 = 142 (0.08%)
1980 = 130 (0.07%)
1981 = 139 (0.08%)
1982 = 142 (0.08%)
1983 = 168 (0.09%)
1984 = 176 (0.1%)
1985 = 171 (0.09%)
1986 = 152 (0.08%)
1987 = 183 (0.1%)
1988 = 165 (0.09%)
1989 = 139 (0.08%)
1990 = 127 (0.07%)
1991 = 115 (0.06%)
1992 = 82 (0.04%)
1993 = 49 (0.03%)
1994 = 41 (0.02%)
1995 = 25 (0.01%)
1996 = 38 (0.02%)
1997 = 56 (0.03%)
1998 = 49 (0.03%)
1999 = 79 (0.04%)
2000 = 428 (0.23%)
2001 = 236 (0.13%)
2002 = 268 (0.15%)
2003 = 235 (0.13%)
2004 = 180 (0.1%)
2005 = 199 (0.11%)
2006 = 145 (0.08%)
2007 = 91 (0.05%)
2008 = 30 (0.02%)
2009 = 26 (0.01%)
2010 = 57 (0.03%)
2011 = 48 (0.03%)
2012 = 45 (0.02%)
2013 = 27 (0.01%)
2014 = 9 (0.0%)
2015 = 16 (0.01%)
2016 = 12 (0.01%)
2017 = 17 (0.01%)
2018 = 16 (0.01%)
2019 = 26 (0.01%)
2020 = 47 (0.03%)

Years (Top 10)
2000 = 428 (0.23%)
2002 = 268 (0.15%)
2001 = 236 (0.13%)
2003 = 235 (0.13%)
2005 = 199 (0.11%)
1987 = 183 (0.1%)
2004 = 180 (0.1%)
1984 = 176 (0.1%)
1985 = 171 (0.09%)
1983 = 168 (0.09%)
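
The year tables above can be approximated by checking each password for every year in a fixed range. In the Python sketch below the default range of 1975 to 2020 is simply borrowed from the output shown; how Pipal itself chooses the range is not stated here, and `year_counts` is a hypothetical name.

```python
def year_counts(passwords, first=1975, last=2020):
    """Count the passwords containing each year in [first, last]; zero counts are omitted."""
    counts = {}
    for year in range(first, last + 1):
        n = sum(str(year) in pw for pw in passwords)
        if n:
            counts[year] = n
    return counts


# Hypothetical sample data.
sample = ["summer2000", "2000mike", "born1987", "abc"]
counts = year_counts(sample)
top = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:10]
print(top)
```

Note that this counts a year anywhere in the password, so "2000mike" is included even though the digits are not at the end.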

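A common assumption is that when people are forced to use a password containing numbers, their usual reaction is to add a single digit to the end. Looking at the next set of statistics, people in this list actually prefer to add two digits to the end. The assumption that the last digit will be "1" does hold, however.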

Single digit on the end = 14447 (7.84%)
Two digits on the end = 18112 (9.82%)
Three digits on the end = 9637 (5.23%)

Last number
0 = 7753 (4.2%)
1 = 13572 (7.36%)
2 = 8735 (4.74%)
3 = 9313 (5.05%)
4 = 6279 (3.41%)
5 = 6408 (3.48%)
6 = 5991 (3.25%)
7 = 6472 (3.51%)
8 = 5726 (3.11%)
9 = 6728 (3.65%)

 |
 |                                                                      
 |                                                                      
 |
 |
 |||
||||
||||
|||||||| |
||||||||||
||||||||||                                                              
||||||||||                                                              
||||||||||                                                              
||||||||||                                                              
||||||||||                                                              
||||||||||                                                              
0123456789

We now look at what the last few digits are. Some of the numbers are expected, but others, such as 21984, are not. Could this be a US zip code?

Last digit
1 = 13572 (7.36%)
3 = 9313 (5.05%)
2 = 8735 (4.74%)
0 = 7753 (4.2%)
9 = 6728 (3.65%)
7 = 6472 (3.51%)
5 = 6408 (3.48%)
4 = 6279 (3.41%)
6 = 5991 (3.25%)
8 = 5726 (3.11%)

Last 2 digits (Top 10)
23 = 3027 (1.64%)
00 = 2185 (1.19%)
01 = 1992 (1.08%)
12 = 1817 (0.99%)
11 = 1620 (0.88%)
99 = 1341 (0.73%)
21 = 1150 (0.62%)
13 = 1095 (0.59%)
69 = 1052 (0.57%)
88 = 1028 (0.56%)

Last 3 digits (Top 10)
123 = 2164 (1.17%)
000 = 708 (0.38%)
234 = 477 (0.26%)
007 = 449 (0.24%)
001 = 430 (0.23%)
666 = 397 (0.22%)
321 = 286 (0.16%)
101 = 284 (0.15%)
002 = 274 (0.15%)
111 = 261 (0.14%)

Last 4 digits (Top 10)
1234 = 424 (0.23%)
2000 = 377 (0.2%)
2002 = 215 (0.12%)
2003 = 202 (0.11%)
2001 = 181 (0.1%)
2005 = 166 (0.09%)
2004 = 153 (0.08%)
1987 = 141 (0.08%)
1988 = 133 (0.07%)
1985 = 132 (0.07%)

Last 5 digits (Top 10)
12345 = 110 (0.06%)
23456 = 68 (0.04%)
54321 = 25 (0.01%)
11111 = 23 (0.01%)
21984 = 21 (0.01%)
00000 = 18 (0.01%)
11988 = 16 (0.01%)
21985 = 15 (0.01%)
23123 = 14 (0.01%)
11984 = 13 (0.01%)
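
The trailing digit statistics above can be sketched in Python by pulling out the run of digits at the end of each password. Two readings are assumed here: "N digits on the end" means the trailing run is exactly N long, and "Last N digits" means the final N digits of any password that ends in at least N digits. Both readings, and all the names, are assumptions of this sketch rather than Pipal's documented behaviour.

```python
import re
from collections import Counter

# The full run of ASCII digits at the very end of the string.
TRAILING = re.compile(r"[0-9]+\Z")


def trailing_run(password):
    """Return the digits the password ends with, or "" if it ends in a non-digit."""
    match = TRAILING.search(password)
    return match.group() if match else ""


def last_digits(passwords, n):
    """Count the final n digits of each password ending in at least n digits."""
    runs = (trailing_run(p) for p in passwords)
    return Counter(run[-n:] for run in runs if len(run) >= n)


def exact_run_count(passwords, n):
    """Count passwords whose trailing digit run is exactly n long."""
    return sum(1 for p in passwords if len(trailing_run(p)) == n)


# Hypothetical sample data.
sample = ["abc1", "abc21", "x123", "pass", "2000"]
print(last_digits(sample, 2).most_common(10))
print(exact_run_count(sample, 1))
```

Under these readings a password such as "x123" contributes to the last digit, last 2 digits and last 3 digits tables, but only to the "three digits on the end" count.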

The last three were suggested by Martin. These are where we start to move from analysis towards cracking, character sets and Hashcat masks
