jbreard Posted 22 hours ago Posted 22 hours ago Hello everyone, I'm looking for a solution to efficiently extract from a given list all duplicates. I've got a solution that works in theory but that is ineficient in practice due to the length of the list I want to analyse (approximately 25 000 elements). The list I'm working with is a list of lists. All individual lists decribe a Covadis bloc (topography software) with the format (MAT, ALT, ALTI, (X, Y, Z), ename of the bloc). MAT, ALT and ALTI are attributes value of the bloc and (X, Y, Z) is the insert coordinates. I want to isolate all the blocs that have the same MAT attribute and save them on a separate list. I have come with the solution described in the code below but it's much to slow and absolutely not opimised. ; lst = general list with bloc caracteristics ; sublst format (MAT, ALT, ALTI, (X, Y, Z), ename) (while (setq sublst (car lst)) (setq lst_same_mat (vl-remove nil (mapcar '(lambda(p) (if (equal (car p) (car sublst)) p nil)) lst))) (setq lst_same_mat_save (cons lst_same_mat lst_same_mat_save)) (setq lst (REMOVE_LST1FROM_LST2 lst_same_mat lst_same_mat_save) ) ;;;; ;lst_gen : list we want to suppress certain element from ;lst_suppr : list of elements to be suppressed in lst_gen (defun REMOVE_LST1_FROM_LST2 (lst_suppr lst_gen / ele_suppr lst_modif n) (setq n 0) (setq lst_modif lst_gen) (while (setq ele_suppr (nth n lst_suppr)) (setq lst_modif (vl-remove ele_suppr lst_modif)) (setq n (1+ n)) ) lst_modif ) Would anyone have an idea on how to speed this up for large list ? Thanks and best regards, Jacques Quote
pkenewell Posted 22 hours ago Posted 22 hours ago Try Lee Mac's "List Duplicates" functions at the following link: https://www.lee-mac.com/uniqueduplicate.html#listdupes 1 Quote
Lee Mac Posted 22 hours ago Posted 22 hours ago I'm unsure if you're looking to group the items by the MAT value or something else (sample data and the expected result would help in this regard), but I might suggest manipulating the data into groups using the MAT value as the key, i.e. (defun foo ( lst / ass rtn ) (foreach itm lst (if (setq ass (assoc (car itm) rtn)) (setq rtn (subst (vl-list* (car ass) itm (cdr ass)) ass rtn)) (setq rtn (cons (list (car itm) itm) rtn)) ) ) ) Quote
Lee Mac Posted 21 hours ago Posted 21 hours ago (edited) Here's another example - for the below function, "unq" will contain the items with unique MAT value, and "dup" will contain those for which there is more than one MAT value: (defun foo ( lst / dup key len unq ) (while (setq itm (car lst)) (setq key (car itm) len (length dup) lst (vl-remove-if (function (lambda ( x ) (if (= key (car x)) (setq dup (cons x dup))))) (cdr lst)) ) (if (< len (length dup)) (setq dup (cons itm dup)) (setq unq (cons itm unq)) ) ) (list unq dup) ) Though it's probably more efficient to use the lists directly in your code rather than the above function returning a list of lists - alternatively, output parameters may be used, e.g.: (defun foo ( lst unq-out dup-out / dup key len unq ) (while (setq itm (car lst)) (setq key (car itm) len (length dup) lst (vl-remove-if (function (lambda ( x ) (if (= key (car x)) (setq dup (cons x dup))))) (cdr lst)) ) (if (< len (length dup)) (setq dup (cons itm dup)) (setq unq (cons itm unq)) ) ) (set unq-out unq) (set dup-out dup) nil ) _$ (foo '(("abc" 1) ("def" 2) ("abc" 3) ("xyz" 4) ("abc" 5)) 'unq-sym 'dup-sym) nil _$ unq-sym (("xyz" 4) ("def" 2)) _$ dup-sym (("abc" 1) ("abc" 5) ("abc" 3)) Note that the element order is not preserved with the above. Edited 21 hours ago by Lee Mac Quote
BIGAL Posted 15 hours ago Posted 15 hours ago I asked a similar question for a task. It groups common items not sure if it is useful as no data to test on. ; Thanks to Dexus for groupby sep 2025 (defun _groupBy (fun lst0 / itm old rtn) (while lst0 (setq itm (fun (car lst0)) rtn (if (setq old (assoc itm rtn)) (subst (cons itm (cons (car lst0) (cdr old))) old rtn) (cons (cons itm (list (car lst0))) rtn) ) lst0 (cdr lst0) ) ) (mapcar 'cdr rtn) ) (setq lst (vl-sort lst '(lambda (x y) (< (car x)(car y))))) (setq lst2 (_groupBy (lambda (e) (car e)) lst)) Quote
jbreard Posted 2 hours ago Author Posted 2 hours ago Thank you all for your answers ! I was able to do what I wanted to achieve thanks to you. The function that worked for me was the first one submitted by LeeMac. I only had to had the line below to get exactly the list I wanted : (defun foo ( lst / ass rtn ) (foreach itm lst (if (setq ass (assoc (car itm) rtn)) (setq rtn (subst (vl-list* (car ass) itm (cdr ass)) ass rtn)) (setq rtn (cons (list (car itm) itm) rtn)) ) ) ) ; lst_test = my list with the structure described in my post i.e lst_test = (lst1 lst2 ... lstn) with lstx = (MAT, ALT, COD, ...) (setq lst_test_sort (foo lst_test)) (setq lst_test_dup (vl-remove nil (mapcar '(lambda(p) (if (> (length p) 2) (cdr p) nil)) lst_test_sort))) I also tried the other function supplied by LeeMac but it was much slower than the first one. I was not able (patient enough) to make it work on a list with approx 25 000 elements. Thanks to BIGAL also for the reply. I must confess that I did not have the time to test the proposed solution. I will have a look at it as soon as I can manage some spare time. I really appreciate the time you took to answer me. Best regards, Jacques Quote
Lee Mac Posted 1 hour ago Posted 1 hour ago You might find the following marginally faster - (defun foo ( lst / ass rtn ) (foreach itm lst (if (setq ass (assoc (car itm) rtn)) (setq rtn (subst (vl-list* (car ass) itm (cdr ass)) ass rtn)) (setq rtn (cons (list (car itm) itm) rtn)) ) ) (mapcar 'cdr (vl-remove-if-not 'cddr rtn)) ) Quote
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.