Question: matching across multiple lines with a regex

    TOUJOURSER · 2019-11-14 08:46:58 +08:00 · 2992 views

    Text content:

     network-object host 109.17.49.131
    object-group network KaGuan_12
     network-object host 109.17.26.11
     network-object host 109.17.26.12
    object-group network ShangHai
     network-object host 110.1.60.91
     network-object host 110.1.60.92
     network-object host 110.1.172.31
     network-object host 110.1.172.32
     network-object host 110.238.250.57
     network-object host 110.238.250.58
    object-group network wangguan_test
     network-object host 111.17.13.32
    object-group network wangguan_app_2
     network-object host 111.17.9.54
     network-object host 111.17.9.63
    

    Given the string 110.1.172.32, I want to match the keyword ShangHai that follows "object-group network". How should the regex be written? Thanks, everyone.

    12 replies · 2019-11-15 09:43:43 +08:00
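    lululau · #1 · 2019-11-14 08:52:05 +08:00 via iPhone
    It's called a regular expression for a reason: at least state the rule. You've given only one special case, so you could just hard-code that case as a string literal and skip the regex entirely. Besides, there is no ShangHai after 110.1.172.32 in your text.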
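    TOUJOURSER (OP) · #2 · 2019-11-14 09:00:25 +08:00
    @lululau Sorry, my description wasn't clear. The rule is: given an IP, get the group that the IP belongs to.
    Starting from 110.1.172.32, I want to search upward for the nearest line that begins with "object-group network " and take the ShangHai at the end of that line.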
    araraloren · #3 · 2019-11-14 09:03:58 +08:00
    Can't you just do a plain string search yourself? If you're practicing regex, ignore what I said.
    ClericPy · #4 · 2019-11-14 09:11:13 +08:00
    import re

    string = r'''network-object host 109.17.49.131
    object-group network KaGuan_12
    network-object host 109.17.26.11
    network-object host 109.17.26.12
    object-group network ShangHai
    network-object host 110.1.60.91
    network-object host 110.1.60.92
    network-object host 110.1.172.31
    network-object host 110.1.172.32
    network-object host 110.238.250.57
    network-object host 110.238.250.58
    object-group network wangguan_test
    network-object host 111.17.13.32
    object-group network wangguan_app_2
    network-object host 111.17.9.54
    network-object host 111.17.9.63'''
    # Escape the dots so they match literally rather than any character.
    ip = re.escape('110.1.172.32')
    # re.search returns the leftmost match, so the capture starts at the
    # first group header in the text, not the one nearest the IP.
    result = re.search(
        rf'object-group network (.*?)\n[\s\S]*?network-object host {ip}',
        string)
    if result:
        print(result.group(1))
    else:
        print('Not found.')

    Output: KaGuan_12
    TOUJOURSER (OP) · #5 · 2019-11-14 09:16:49 +08:00
    @ClericPy Thanks. Your pattern only matches KaGuan_12, but I need ShangHai; 110.1.172.32 belongs to the ShangHai group.
    todd7zhang · #6 · 2019-11-14 09:30:24 +08:00 · ❤️ 2
    import re

    a = ''' network-object host 109.17.49.131
    object-group network KaGuan_12
    network-object host 109.17.26.11
    network-object host 109.17.26.12
    object-group network ShangHai
    network-object host 110.1.60.91
    network-object host 110.1.60.92
    network-object host 110.1.172.31
    network-object host 110.1.172.32
    network-object host 110.238.250.57
    network-object host 110.238.250.58
    object-group network wangguan_test
    network-object host 111.17.13.32
    object-group network wangguan_app_2
    network-object host 111.17.9.54
    network-object host 111.17.9.63
    '''

    # (?:(?!object-group).)* consumes text without crossing into the next
    # group header; the lookahead then requires the target host line.
    print(re.findall(r'object-group network (\w+)(?:(?:(?!object-group).)*)(?=network-object host 110\.1\.172\.32)', a, re.DOTALL))
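    ClericPy · #7 · 2019-11-14 09:33:15 +08:00
    @TOUJOURSER #5 Oh, right. A non-greedy quantifier can't make the regex go back and pick the nearest header; the match still starts at the first one. Better approaches:
    1. Parse the text line by line into an ip: group mapping, then look the IP up.
    2. Or split on the group field, then search within each piece.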
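
Both suggestions above could look roughly like the sketch below. The function names and the shortened `SAMPLE` text are made up for illustration, and the code assumes every relevant line has exactly the form `object-group network <name>` or `network-object host <ip>`.

```python
def build_ip_to_group(config_text):
    """Approach 1: map each host IP to the nearest group header above it."""
    ip_to_group = {}
    current_group = None
    for line in config_text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[:2] == ["object-group", "network"]:
            current_group = parts[2]
        elif len(parts) >= 3 and parts[:2] == ["network-object", "host"]:
            # Hosts that appear before any header are stored under None.
            ip_to_group[parts[2]] = current_group
    return ip_to_group


def find_group_by_split(config_text, ip):
    """Approach 2: split on the group header, then search each chunk."""
    # The first piece is whatever precedes the first header, so skip it.
    for chunk in config_text.split("object-group network ")[1:]:
        name, _, body = chunk.partition("\n")
        hosts = [ln.split()[-1] for ln in body.splitlines() if ln.split()]
        if ip in hosts:
            return name.strip()
    return None


# Hypothetical, shortened version of the text from the question.
SAMPLE = """\
 network-object host 109.17.49.131
object-group network KaGuan_12
 network-object host 109.17.26.11
object-group network ShangHai
 network-object host 110.1.172.31
 network-object host 110.1.172.32
object-group network wangguan_test
 network-object host 111.17.13.32
"""

print(build_ip_to_group(SAMPLE).get("110.1.172.32"))  # ShangHai
print(find_group_by_split(SAMPLE, "110.1.172.32"))    # ShangHai
```

The dictionary pays off when many IPs are looked up against the same text; the split version avoids building anything for a one-off query.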
    ClericPy · #8 · 2019-11-14 09:34:12 +08:00 · ❤️ 1
    The zero-width assertion in #6 is a good approach: first match the non-group content, then capture the group.
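
For an arbitrary IP, the same zero-width-assertion idea might be wrapped as in the sketch below. This is a variation on the pattern in #6, not that reply's exact code: the helper name `group_for_ip` is invented, `re.escape` handles the dots, and the trailing `(?!\d)` is an added assumption so that `110.1.172.3` does not match inside `110.1.172.32`.

```python
import re


def group_for_ip(config_text, ip):
    """Return the group whose host list contains ip, or None."""
    pattern = (
        r"object-group network (\S+)"       # capture the group name
        r"(?:(?!object-group network ).)*?"  # advance, never crossing a header
        r"network-object host " + re.escape(ip) + r"(?!\d)"  # exact host line
    )
    match = re.search(pattern, config_text, re.DOTALL)
    return match.group(1) if match else None


# Hypothetical, shortened version of the text from the question.
SAMPLE = """\
 network-object host 109.17.49.131
object-group network KaGuan_12
 network-object host 109.17.26.11
object-group network ShangHai
 network-object host 110.1.172.31
 network-object host 110.1.172.32
object-group network wangguan_test
 network-object host 111.17.13.32
"""

print(group_for_ip(SAMPLE, "110.1.172.32"))  # ShangHai
print(group_for_ip(SAMPLE, "110.1.172.3"))   # None
```

Because the inner negative lookahead stops the scan at the next header, an attempt that starts at KaGuan_12 fails instead of running on into the ShangHai block, which is what went wrong in #4.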
    TOUJOURSER (OP) · #9 · 2019-11-14 09:38:14 +08:00
    @ClericPy @todd7zhang Thank you so much. This problem had been bugging me for a long time.
    qingshengwen · #10 · 2019-11-14 12:07:55 +08:00
    Try this one:
    object\-group\snetwork\s(.*?)(?:\s*network\-object\shost\s\d+\.\d+\.\d+\.\d+\s*){0,}(?:\s*network\-object\shost\s110\.1\.172\.32\s*)
    https://imgur.com/a/loDyJo7
    no1xsyzy · #11 · 2019-11-14 19:39:32 +08:00
    Why not just write an FSM? Does this even call for a regex? (It took under ten minutes to write.)
    https://gist.github.com/no1xsyzy/11ba23cd13f90b3f94227d4838fd7e5a
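
A state machine for this format can be very small. The sketch below is not the code from the linked gist; it is one possible shape, with invented names, where the only state is the currently open group and the input is any iterable of lines (a list or an open file).

```python
def iter_memberships(lines):
    """Yield (group, ip) pairs; the machine's only state is the open group."""
    group = None                      # state: no header seen yet
    for line in lines:
        tokens = line.split()
        if len(tokens) == 3 and tokens[:2] == ["object-group", "network"]:
            group = tokens[2]         # transition: a header opens a new group
        elif len(tokens) == 3 and tokens[:2] == ["network-object", "host"]:
            yield group, tokens[2]    # emit the host under the current state


def lookup_group(lines, ip):
    """Return the group of the first matching host, or None if not found."""
    return next((g for g, host in iter_memberships(lines) if host == ip), None)


# Hypothetical, shortened version of the text from the question.
SAMPLE = """\
 network-object host 109.17.49.131
object-group network KaGuan_12
 network-object host 109.17.26.11
object-group network ShangHai
 network-object host 110.1.172.31
 network-object host 110.1.172.32
object-group network wangguan_test
 network-object host 111.17.13.32
"""

print(lookup_group(SAMPLE.splitlines(), "110.1.172.32"))  # ShangHai
```

Since the generator is lazy, `lookup_group` stops reading as soon as the host is found. Note that a host listed before any header also yields `None`, so this sketch cannot tell "no group" from "not found".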
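    corningsun · #12 · 2019-11-15 09:43:43 +08:00
    No regex needed.

    If you only need a single lookup, it's just a matter of slicing the string twice. Sample code:

    https://gist.github.com/corningsun/87aac1ca2bc46e698a4d332a93118bbe

    If you need many lookups and the data set isn't large, the best option is to parse all the IPs and look them up in a hash table.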
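
The single-lookup case can be done with `str.find` and `str.rfind` alone. The sketch below is not the linked gist's code; the function name is invented, and it assumes each host line ends right after the IP with no trailing whitespace.

```python
def group_by_slicing(config_text, ip):
    """Locate the host line, then search backwards for the nearest header."""
    # Guarantee a trailing newline so the last line can be matched as well.
    text = config_text if config_text.endswith("\n") else config_text + "\n"
    # Including "\n" keeps 110.1.172.3 from matching 110.1.172.32.
    host_pos = text.find("network-object host " + ip + "\n")
    if host_pos == -1:
        return None
    header = "object-group network "
    header_pos = text.rfind(header, 0, host_pos)  # nearest header above
    if header_pos == -1:
        return None                               # host precedes any header
    name_start = header_pos + len(header)
    return text[name_start:text.index("\n", name_start)].strip()


# Hypothetical, shortened version of the text from the question.
SAMPLE = """\
 network-object host 109.17.49.131
object-group network KaGuan_12
 network-object host 109.17.26.11
object-group network ShangHai
 network-object host 110.1.172.31
 network-object host 110.1.172.32
object-group network wangguan_test
 network-object host 111.17.13.32
"""

print(group_by_slicing(SAMPLE, "110.1.172.32"))  # ShangHai
```

For repeated lookups, building a dictionary once (the hash-table suggestion) avoids rescanning the text on every query.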