I want to remove duplicate lines without changing the order, keeping only the first occurrence of each line.
If I am not mistaken, the following should work:
cat -n input.txt | sort -uk2 | sort -nk1 | cut -f2- > output.txt
None of these commands needs to hold the file in memory all at once.
But as mentioned previously, sort can need quite a lot of disk space for its temporary files, so that might be the bottleneck, depending on your system specs.
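For what it's worth, here is a slightly more defensive sketch of the same pipeline, assuming GNU or BSD sort (the -s and -T options are not in POSIX). It uses the tab written by cat -n as the field separator, asks for a stable sort so that the earliest copy of each line should be the one that survives -u, forces the C locale so lines are compared byte by byte, and points the temporary files at a directory of your choice. The sample data and the sort-tmp directory name are made up for illustration.

```shell
#!/bin/sh
# Illustrative sample input (not from the original question).
printf 'b\na\nb\nc\na\n' > input.txt

# Hypothetical scratch directory; put it on a disk with enough free space.
tmpdir=./sort-tmp
mkdir -p "$tmpdir"

# A literal tab, which cat -n writes between the line number and the text.
tab=$(printf '\t')

# 1. cat -n         prefix every line with its line number and a tab
# 2. sort -s -u -k2 compare only the text after the tab and keep one line per
#                   distinct text; -s keeps equal lines in input order
# 3. sort -n -k1,1  put the survivors back in original order by line number
# 4. cut -f2-       strip the line number again
cat -n input.txt \
  | LC_ALL=C sort -T "$tmpdir" -t "$tab" -s -u -k2 \
  | LC_ALL=C sort -T "$tmpdir" -t "$tab" -n -k1,1 \
  | cut -f2- > output.txt

cat output.txt
# prints:
# b
# a
# c
```

If the default temporary directory is the problem, setting -T (or the TMPDIR environment variable) to a larger disk is usually the simplest thing to try.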