How do I do sed-like text replacement in Python?

I want to enable all the apt repositories in this file:

    cat /etc/apt/sources.list
    ## Note, this file is written by cloud-init on first boot of an instance
    ## modifications made here will not survive a re-bundle.
    ## if you wish to make changes you can:
    ## a.) add 'apt_preserve_sources_list: true' to /etc/cloud/cloud.cfg
    ##     or do the same in user-data
    ## b.) add sources in /etc/apt/sources.list.d
    #
    # See http://help.ubuntu.com/community/UpgradeNotes for how to upgrade to
    # newer versions of the distribution.
    deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick main
    deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick main

    ## Major bug fix updates produced after the final release of the
    ## distribution.
    deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates main
    deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates main

    ## NB software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
    ## team. Also, please note that software in universe WILL NOT receive any
    ## review or updates from the Ubuntu security team.
    deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick universe
    deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick universe
    deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates universe
    deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates universe

    ## NB software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
    ## team, and may not be under a free licence. Please satisfy yourself as to
    ## your rights to use the software. Also, please note that software in
    ## multiverse WILL NOT receive any review or updates from the Ubuntu
    ## security team.
    # deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick multiverse
    # deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick multiverse
    # deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates multiverse
    # deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates multiverse

    ## Uncomment the following two lines to add software from the 'backports'
    ## repository.
    ## NB software from this repository may not have been tested as
    ## extensively as that contained in the main release, although it includes
    ## newer versions of some applications which may provide useful features.
    ## Also, please note that software in backports WILL NOT receive any review
    ## or updates from the Ubuntu security team.
    # deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-backports main restricted universe multiverse
    # deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-backports main restricted universe multiverse

    ## Uncomment the following two lines to add software from Canonical's
    ## 'partner' repository.
    ## This software is not part of Ubuntu, but is offered by Canonical and the
    ## respective vendors as a service to Ubuntu users.
    # deb http://archive.canonical.com/ubuntu maverick partner
    # deb-src http://archive.canonical.com/ubuntu maverick partner

    deb http://security.ubuntu.com/ubuntu maverick-security main
    deb-src http://security.ubuntu.com/ubuntu maverick-security main
    deb http://security.ubuntu.com/ubuntu maverick-security universe
    deb-src http://security.ubuntu.com/ubuntu maverick-security universe
    # deb http://security.ubuntu.com/ubuntu maverick-security multiverse
    # deb-src http://security.ubuntu.com/ubuntu maverick-security multiverse

With sed this is a simple sed -i 's/^# deb/deb/' /etc/apt/sources.list. What is the most elegant ("Pythonic") way of doing this?


massedit.py (http://github.com/elmotec/massedit) does the scaffolding for you, leaving just the regular expression to write. It is still in beta, but we are looking for feedback.

 python -m massedit -e "re.sub(r'^# deb', 'deb', line)" /etc/apt/sources.list 

This shows the differences (before/after) in diff format.

Add the -w option to write the changes to the original file:

 python -m massedit -e "re.sub(r'^# deb', 'deb', line)" -w /etc/apt/sources.list 

Alternatively, you can now use the API:

    >>> import massedit
    >>> filenames = ['/etc/apt/sources.list']
    >>> massedit.edit_files(filenames, ["re.sub(r'^# deb', 'deb', line)"], dry_run=True)

You can do it like this:

    import re

    # Read everything first, then reopen the same file for writing.
    with open("/etc/apt/sources.list", "r") as sources:
        lines = sources.readlines()
    with open("/etc/apt/sources.list", "w") as sources:
        for line in lines:
            sources.write(re.sub(r'^# deb', 'deb', line))

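The with statement ensures that the file is closed correctly, and reopening the file in "w" mode empties it before you write to it. re.sub(pattern, replace, string) is the equivalent of s/pattern/replace/ in sed/perl.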
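One difference from sed is worth keeping in mind: s/pattern/replace/ without the g flag replaces only the first match on a line, whereas re.sub replaces every match unless count is given. A minimal sketch on a made-up sample line (the URL is a placeholder, not taken from the file above):

```python
import re

# Hypothetical sample line containing the pattern twice.
line = "# deb http://example.invalid/ubuntu maverick main # deb\n"

# Anchored pattern: at most one match per line, so count is irrelevant.
uncommented = re.sub(r'^# deb', 'deb', line)

# Unanchored pattern: re.sub replaces all matches, like sed's s/.../.../g.
all_matches = re.sub(r'# deb', 'deb', line)

# count=1 mimics sed without the g flag (first match only).
first_only = re.sub(r'# deb', 'deb', line, count=1)
```

For the anchored pattern used in the question the two behave the same, so the distinction only matters once the pattern can match more than once per line.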

Edit: fixed the syntax in the example.

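Authoring a homegrown sed replacement in pure Python, with no external commands or additional dependencies, is a noble task laden with noble landmines. Who would have thought?

Nonetheless, it is feasible. It is also desirable. We've all been there, people: "I need to munge some plain-text files, but I only have Python, two plastic shoelaces, and a moldy can of Maraschino cherries."

In this answer, we offer a best-of-breed solution that combines the strengths of the prior answers without their weaknesses. As plundra notes, David Miller's otherwise top-notch answer writes the target file non-atomically and hence invites race conditions (e.g., from other threads and/or processes attempting to read that file concurrently). That's bad. Plundra's otherwise excellent answer solves that issue while introducing yet more: numerous fatal encoding errors, a critical security vulnerability (failing to preserve the permissions and other metadata of the original file), and premature optimization that replaces regular expressions with low-level character indexing. That's also bad.

Awesomeness, unite!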

    import re, shutil, tempfile

    def sed_inplace(filename, pattern, repl):
        '''
        Perform the pure-Python equivalent of in-place `sed` substitution: e.g.,
        `sed -i -e 's/'${pattern}'/'${repl}'/' "${filename}"`.
        '''
        # For efficiency, precompile the passed regular expression.
        pattern_compiled = re.compile(pattern)

        # For portability, NamedTemporaryFile() defaults to mode "w+b" (i.e., binary
        # writing with updating). This is usually a good thing. In this case,
        # however, binary writing imposes non-trivial encoding constraints trivially
        # resolved by switching to text writing. Let's do that.
        with tempfile.NamedTemporaryFile(mode='w', delete=False) as tmp_file:
            with open(filename) as src_file:
                for line in src_file:
                    tmp_file.write(pattern_compiled.sub(repl, line))

        # Overwrite the original file with the munged temporary file in a
        # manner preserving file attributes (e.g., permissions).
        shutil.copystat(filename, tmp_file.name)
        shutil.move(tmp_file.name, filename)

    # Do it for Johnny.
    sed_inplace('/etc/apt/sources.list', r'^\# deb', 'deb')
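One caveat about the atomicity argument: a rename is only atomic when source and destination are on the same filesystem, and the default temporary directory often is not. The following is a sketch of a variant, not part of the answer above, that creates the temporary file next to the target and finishes with os.replace. The name sed_inplace_atomic is made up for illustration.

```python
import os
import re
import shutil
import tempfile


def sed_inplace_atomic(filename, pattern, repl):
    """Hypothetical variant of sed_inplace() that keeps the temporary
    file in the target's directory so the final rename stays on one
    filesystem."""
    pattern_compiled = re.compile(pattern)
    target_dir = os.path.dirname(os.path.abspath(filename))

    tmp_file = tempfile.NamedTemporaryFile(mode='w', dir=target_dir, delete=False)
    try:
        # Both files are closed when this block exits.
        with tmp_file, open(filename) as src_file:
            for line in src_file:
                tmp_file.write(pattern_compiled.sub(repl, line))

        # Copy permissions and timestamps, then swap the files.
        shutil.copystat(filename, tmp_file.name)
        os.replace(tmp_file.name, filename)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        os.unlink(tmp_file.name)
        raise
```

It would be called the same way as the original, e.g. sed_inplace_atomic('/etc/apt/sources.list', r'^# deb', 'deb'), and assumes the caller may create files in the target's directory.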

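Here is a different approach; I didn't want to edit my other answer. The with statements are nested because I'm not using 3.1 (where with A() as a, B() as b: works).

It is probably a bit overkill for changing sources.list, but I wanted to put it out there for future searches.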

    #!/usr/bin/env python
    from shutil import move
    from tempfile import NamedTemporaryFile

    # Written for Python 2. On Python 3, pass mode="w" to NamedTemporaryFile,
    # because its default binary mode rejects str arguments to write().
    with NamedTemporaryFile(delete=False) as tmp_sources:
        with open("sources.list") as sources_file:
            for line in sources_file:
                if line.startswith("# deb"):
                    tmp_sources.write(line[2:])
                else:
                    tmp_sources.write(line)

    move(tmp_sources.name, sources_file.name)

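This should ensure there are no race conditions with others reading the file. Oh, and I prefer str.startswith(...) when you can do without a regular expression.

If you are using Python 3, the following module will help you: https://github.com/mahmoudadel2/pysed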

 wget https://raw.githubusercontent.com/mahmoudadel2/pysed/master/pysed.py 

Place the module file in your Python 3 module path, then:

    import pysed
    pysed.replace(<Old string>, <Replacement String>, <Text File>)
    pysed.rmlinematch(<Unwanted string>, <Text File>)
    pysed.rmlinenumber(<Unwanted Line Number>, <Text File>)

Try https://pypi.python.org/pypi/pysed

    pysed -r '# deb' 'deb' /etc/apt/sources.list

Not sure about elegant, but this ought to be pretty readable at least. For a sources.list it is fine to read all the lines beforehand; for something larger you might want to change it "in place" while looping through it.

    #!/usr/bin/env python

    # Open file for reading and writing
    with open("sources.list", "r+") as sources_file:
        # Read all the lines
        lines = sources_file.readlines()

        # Rewind and truncate
        sources_file.seek(0)
        sources_file.truncate()

        # Loop through the lines, adding them back to the file.
        for line in lines:
            if line.startswith("# deb"):
                sources_file.write(line[2:])
            else:
                sources_file.write(line)

Edit: improved the file handling by using the with statement. I had also forgotten to rewind before truncating.
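For the larger-file case mentioned above, the standard library's fileinput module can rewrite a file one line at a time. The following is a sketch under that assumption rather than part of the original answer; the function name uncomment_deb_lines is made up for illustration.

```python
import fileinput
import sys


def uncomment_deb_lines(path):
    """Uncomment '# deb' lines in place without loading the whole file."""
    # With inplace=True, fileinput moves the original file aside and
    # redirects standard output into a new file at the same path.
    with fileinput.input(files=[path], inplace=True) as lines:
        for line in lines:
            if line.startswith("# deb"):
                line = line[2:]
            # Written to the file, not the terminal, while inplace is active.
            sys.stdout.write(line)
```

Because standard output is redirected for the duration of the loop, any debugging output inside it should go to sys.stderr instead.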

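You could do something like this:

    import re

    # Raw string, so the pattern reaches the regex engine unchanged.
    p = re.compile(r"^# *deb", re.MULTILINE)
    with open("sources.list", "r") as f:
        text = f.read()
    with open("sources.list", "w") as f:
        f.write(p.sub("deb", text))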
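The re.MULTILINE flag matters here because the whole file is read into a single string: without it, ^ matches only at the very start of the text. A short sketch on a made-up two-line sample (the URLs are placeholders):

```python
import re

# Hypothetical two-line stand-in for the contents of sources.list.
text = (
    "# deb http://example.invalid/a main\n"
    "#deb http://example.invalid/b main\n"
)

multiline = re.compile(r"^# *deb", re.MULTILINE)
single = re.compile(r"^# *deb")

# With MULTILINE, ^ matches at the start of every line.
every_line = multiline.sub("deb", text)

# Without it, only the first line of the string is uncommented.
first_line_only = single.sub("deb", text)
```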

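Alternatively (and this is better from an organizational standpoint), you can split your sources.list into pieces (one entry per repository) and place them under /etc/apt/sources.list.d/

Here is a one-module Python replacement for perl -p: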

    # Provide compatibility with `perl -p`

    # Usage:
    #
    #     python -mloop_over_stdin_lines '<program>'

    # In `<program>`, use the variable `line` to read and change the current line.

    # Example:
    #
    #     python -mloop_over_stdin_lines 'line = re.sub("pattern", "replacement", line)'

    # From the perlrun documentation:
    #
    #     -p   causes Perl to assume the following loop around your
    #          program, which makes it iterate over filename arguments
    #          somewhat like sed:
    #
    #            LINE:
    #              while (<>) {
    #                  ...             # your program goes here
    #              } continue {
    #                  print or die "-p destination: $!\n";
    #              }
    #
    #          If a file named by an argument cannot be opened for some
    #          reason, Perl warns you about it, and moves on to the next
    #          file. Note that the lines are printed automatically. An
    #          error occurring during printing is treated as fatal. To
    #          suppress printing use the -n switch. A -p overrides a -n
    #          switch.
    #
    #          "BEGIN" and "END" blocks may be used to capture control
    #          before or after the implicit loop, just as in awk.
    #

    import re
    import sys

    for line in sys.stdin:
        # Run the user's program; it may rebind `line`.
        exec(sys.argv[1], globals(), locals())
        try:
            # Lines keep their own newline, so do not add another.
            print(line, end='')
        except Exception as exc:
            sys.exit('-p destination: %s' % exc)

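If you really want to use a sed command without installing a new Python module, you can simply do the following:

    import subprocess

    # Pass the command as an argument list; a single string would need shell=True.
    subprocess.call(["sed", "-i", "s/^# deb/deb/", "/etc/apt/sources.list"])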

I wanted to be able to find and replace text, but also to include matched groups in the content I insert. I wrote this short script to do that:

https://gist.github.com/turtlemonvh/0743a1c63d1d27df3f17

The key part of it looks like this:

 print(re.sub(pattern, template, text).rstrip("\n")) 

Here is an example of how it works:

    # Find everything that looks like 'dog' or 'cat' followed by a space and a number
    pattern = r"((cat|dog) (\d+))"

    # Replace with 'turtle' and the number. '3' because the number is the 3rd matched group.
    # The double '\' is needed because you need to escape '\' when running this in a python shell
    template = "turtle \\3"

    # The text to operate on
    text = "cat 976 is my favorite"

Calling the above function with these values yields:

 turtle 976 is my favorite 
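Putting the fragments above together gives a runnable sketch. It is reconstructed from the lines shown here, not copied from the gist, and the named-group variant at the end is an addition for illustration.

```python
import re

# Numbered groups: group 3 is the number after 'cat' or 'dog'.
pattern = r"((cat|dog) (\d+))"
template = r"turtle \3"  # raw string, so a single backslash is enough
text = "cat 976 is my favorite"

result = re.sub(pattern, template, text)

# Named groups avoid counting parentheses; \g<name> refers to a group by name.
named_pattern = r"(?P<animal>cat|dog) (?P<number>\d+)"
named_result = re.sub(named_pattern, r"turtle \g<number>", text)

print(result)  # → turtle 976 is my favorite
```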

Python has a regular expression module (import re). Why not use it the way you would in perl? It supports most of the features of Perl regular expressions.