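I am trying to determine whether a substring exists in a text document with a repetitive format. I loop over specific keywords and try to identify another word that follows each one. The two words are always separated by an integer whose value varies. If possible, I would like a way to make the integer part of the substring stand for any integer value. Something like this: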
# pseudocode: `integer` is a placeholder for "any integer", not a real variable
substr = keyword + ' ' + integer + ' ' + word
teststr = "one two three keyword 24 word four five"
if substr in teststr:
    print("substr exists in teststr")
Alternatively, I could loop over the tokens and check each one:
for el in teststr.split():
    # only numeric tokens can sit between the keyword and the word
    if el.isdigit():
        checkstr = keyword + ' ' + el + ' ' + word
        if checkstr in teststr:
            print("yes")
Solution:
You can use a regular expression to capture this pattern. Here is a quick implementation of what you are looking for:
import re
sample = "one two three keyword 24 word four five, another test is here pick 12 me"
# (\w+) is a group that captures a word, followed by a number (\d+), then another word;
# \s+ requires whitespace between them (a bare "." would match any single character)
pattern = r"(\w+)\s+(\d+)\s+(\w+)"
result = re.findall(pattern, sample)
if result:
    print('yes')
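The pattern above matches any word, number, word triple. To test for one particular keyword and word, as in the question, the pattern can be built from those two strings. This is a minimal sketch, assuming the three parts are separated by whitespace; the function name find_keyword_number_word is hypothetical and not part of the re module:

```python
import re

def find_keyword_number_word(text, keyword, word):
    """Return the integers found between `keyword` and `word` in `text`."""
    # re.escape guards against regex metacharacters in the two words;
    # \b keeps "keyword" from matching inside a longer word such as "mykeyword".
    pattern = rf"\b{re.escape(keyword)}\s+(\d+)\s+{re.escape(word)}\b"
    # With a single capture group, findall returns the captured digit strings.
    return [int(n) for n in re.findall(pattern, text)]

teststr = "one two three keyword 24 word four five"
numbers = find_keyword_number_word(teststr, "keyword", "word")
if numbers:
    print("substr exists in teststr:", numbers)
```

Returning the captured integers instead of a plain boolean keeps the existence check (an empty list is falsy) while also exposing the values that separated the two words.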







