Tutorials

Remove duplicate lines from text files (with sort)

A quick method to remove duplicate lines from text files, including for example CSV files, where multiple records have been added (perhaps automatically) at different times, leaving multiple copies of the same record scattered throughout the file. Here is a simple one-line bash command to remove the duplicates using sort.

This method is sensitive to the line endings of the file. If your files have been edited on a combination of Unix/Linux/Mac/Windows systems, you may have a variety of line endings in place, and lines that look identical will not be treated as duplicates if their endings differ.

The simplest route is to run the file through dos2unix before attempting the sort/unique filter.
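For example, mixed line endings can be normalized before deduplicating. The sketch below uses tr to strip carriage returns as a stand-in for dos2unix (in case dos2unix is not installed); the filename and contents are illustrative:

```shell
# Create a sample file where one "apple" line has a Windows (CRLF)
# ending and the other a Unix (LF) ending - sort -u alone would
# treat them as different lines.
printf 'apple\r\napple\nbanana\n' > file.csv

# Strip carriage returns (equivalent here to dos2unix), then sort
# and deduplicate in place.
tr -d '\r' < file.csv > file.tmp && mv file.tmp file.csv
sort -u file.csv -o file.csv

cat file.csv
# apple
# banana
```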

Requirements

sort

Method

In a bash shell enter:

sort -u file.csv -o file.csv

This takes your file, sorts it (using sort), keeps only the unique lines (-u), and writes the result to the output file (-o), which here is the same as the input file.
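As a quick check, the behaviour can be verified on a throwaway file (the filename and records below are illustrative):

```shell
# Build a small file with duplicate records scattered throughout.
printf 'b,2\na,1\nb,2\na,1\nc,3\n' > demo.csv

# Sort, keep unique lines only (-u), write back to the same file (-o).
sort -u demo.csv -o demo.csv

cat demo.csv
# a,1
# b,2
# c,3
```

Note that the output is sorted, not in the original order, and any CSV header line will be sorted into the body along with the data rows, so this approach suits files without a header or where order does not matter.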

