The join function raises an error when it is called with the following values.
# TypeError: sequence item 0: expected str instance, list found
"".join([["first"], ["second"]])
"".join([["1", 2, 3], [4, "5", 6]])
"".join([[[11], [21, 22]], [[31, 32], [41, 42]]])
This article shows how to solve the problem.
How join function works
I expected join to be called on a list with the separator as its argument, as in TypeScript/JavaScript.
In Python, however, it is the other way around: join is a method of the separator string, and the list is its argument.
In TypeScript/JavaScript
["one", "two", "three"].join(",");
In Python
",".join(["one", "two", "three"]) # one,two,three
Let’s check the behavior first with some examples.
print("--- ".join("a")) # a
print("--- ".join("abc")) # a--- b--- c
print("--- ".join(["a", "b", "c"])) # a--- b--- c
print("--- ".join(["ab", "cd", "ef"])) # ab--- cd--- ef
The string on the left side is used as the separator. If the argument is a string, it is treated as a sequence of single characters, and the separator is inserted between them.
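The behavior above follows from the fact that join accepts any iterable whose items are strings, not only a list. The following is a minimal sketch with illustrative variable names that are not part of the original examples.

```python
separator = "--- "

# A string is an iterable of its characters, so each character becomes an item.
from_string = separator.join("abc")

# A tuple and a generator expression are handled the same way as a list.
from_tuple = separator.join(("a", "b", "c"))
from_generator = separator.join(ch for ch in "abc")

# Iterating over a dict yields its keys in insertion order.
from_dict = separator.join({"a": 1, "b": 2, "c": 3})

print(from_string)     # a--- b--- c
print(from_tuple)      # a--- b--- c
print(from_generator)  # a--- b--- c
print(from_dict)       # a--- b--- c
```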
Error cases that join cannot handle
List of lists (sequence item 0: expected str instance, list found)
If the list consists of lists, an error occurs.
try:
    print("--- ".join([["first"], ["second"]]))
except TypeError as e:
    # sequence item 0: expected str instance, list found
    print(e)
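The error message names the position and the type of the first item that is not a string, which helps to locate the nested list. The sketch below illustrates this. The helper name join_error is hypothetical and used only for this example.

```python
def join_error(values):
    """Return the TypeError message raised by join, or None if it succeeds."""
    try:
        "--- ".join(values)
    except TypeError as err:
        return str(err)
    return None

# The first item is already a list.
print(join_error([["first"], ["second"]]))  # sequence item 0: expected str instance, list found
# The first item is a string, so the error points at index 1.
print(join_error(["first", ["second"]]))    # sequence item 1: expected str instance, list found
# All items are strings, so no error is raised.
print(join_error(["first", "second"]))      # None
```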
List containing non-string values (mixed data types)
If the list contains non-string values, which are int here, Pylance reports a type error in the editor, and the call raises a TypeError at runtime.
# TypeError: sequence item 0: expected str instance, int found
",".join([1, 2, 3])
# Argument of type "list[str | int]" cannot be assigned to parameter "__iterable" of type "Iterable[str]" in function "join"
# "Literal[2]" is incompatible with "str" Pylance(reportGeneralTypeIssues)
",".join(["1", 2, 3])
Solutions
I will define several functions. To check their behavior, I use the following dataset. The expected output is written as a comment.
TEST_DATASET = [
    [1, 2, 3],  # "123"
    ["1", "2", "3"],  # "123"
    ["1", 2, 3],  # "123"
    [["first"], ["second"]],  # "firstsecond"
    [["1", "2"], ["3"]],  # "123"
    [[1, 2, 3], [4, 5, 6]],  # "123456"
    [["1", 2, 3], [4, "5", 6]],  # "123456"
    [[["fir"], ["st"]], [["se", "co"], ["nd"]]],  # "firstsecond"
    [[[11], [21, 22]], [[31, 32], [41, 42]]],  # "11212231324142"
]
The function to run the test is the following.
def run_test(callback, values):
    try:
        # The callback returns the list passed to join and the joined string.
        intermediate, result = callback(values)
        print(f"intermediate: {intermediate}")
        print(f"RESULT: {values} -> {result}")
    except Exception as err:
        print(f"ERROR: {values}, {err}")
    finally:
        # An empty line separates the output of each test case.
        print()
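The function above prints the results for visual inspection. If an automatic check is preferred, the expected strings from the dataset comments can be stored next to the inputs. This is only a sketch: EXPECTED, check, and join_strings are hypothetical names, and only three of the dataset entries are shown.

```python
# Each input is paired with the string that the dataset comments expect.
EXPECTED = [
    ([1, 2, 3], "123"),
    (["1", "2", "3"], "123"),
    ([["first"], ["second"]], "firstsecond"),
]

def check(callback, cases):
    """Return (passed, failed) counts for a callback returning (intermediate, result)."""
    passed = failed = 0
    for values, expected in cases:
        try:
            _, result = callback(values)
        except Exception:
            # A raised error counts as a failed case.
            failed += 1
            continue
        if result == expected:
            passed += 1
        else:
            failed += 1
    return passed, failed

def join_strings(values):
    # Stand-in callback for illustration: it only works for a flat list of strings.
    intermediate = list(values)
    return intermediate, "".join(intermediate)

print(check(join_strings, EXPECTED))  # (1, 2)
```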
Solution 1 for-in loop only for string list
The first solution uses a for-in loop, written as a list comprehension, to join each element before joining the whole list.
def solution1(values):
    intermediate = ["".join(element) for element in values]
    result = "".join(intermediate)
    return intermediate, result

for values in TEST_DATASET:
    run_test(solution1, values)
# ERROR: [1, 2, 3], can only join an iterable
# intermediate: ['1', '2', '3']
# RESULT: ['1', '2', '3'] -> 123
# ERROR: ['1', 2, 3], can only join an iterable
# intermediate: ['first', 'second']
# RESULT: [['first'], ['second']] -> firstsecond
# intermediate: ['12', '3']
# RESULT: [['1', '2'], ['3']] -> 123
# ERROR: [[1, 2, 3], [4, 5, 6]], sequence item 0: expected str instance, int found
# ERROR: [['1', 2, 3], [4, '5', 6]], sequence item 1: expected str instance, int found
# ERROR: [[['fir'], ['st']], [['se', 'co'], ['nd']]], sequence item 0: expected str instance, list found
# ERROR: [[[11], [21, 22]], [[31, 32], [41, 42]]], sequence item 0: expected str instance, list found
This solution works well if the list contains only strings, either directly or inside one level of inner lists.
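For the same string-only inputs, the standard library offers a shorter alternative. The sketch below uses itertools.chain.from_iterable instead of the list comprehension. It is a different technique from solution 1 and has the same limitation: every item must be a string or an iterable of strings.

```python
from itertools import chain

values = [["first"], ["second"]]

# chain.from_iterable yields the items of each inner list in order.
nested_result = "".join(chain.from_iterable(values))
print(nested_result)  # firstsecond

# A flat list of strings also works, because each string is iterated character by character.
flat_result = "".join(chain.from_iterable(["1", "2", "3"]))
print(flat_result)  # 123
```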
Solution 2 cast to str in for-in for mixed data type
If you need to process a list that has mixed data types, you need to convert each value to a string.
def solution2(values):
    intermediate = ["".join(str(element)) for element in values]
    result = "".join(intermediate)
    return intermediate, result

for values in TEST_DATASET:
    run_test(solution2, values)
# intermediate: ['1', '2', '3']
# RESULT: [1, 2, 3] -> 123
# intermediate: ['1', '2', '3']
# RESULT: ['1', '2', '3'] -> 123
# intermediate: ['1', '2', '3']
# RESULT: ['1', 2, 3] -> 123
# intermediate: ["['first']", "['second']"]
# RESULT: [['first'], ['second']] -> ['first']['second']
# intermediate: ["['1', '2']", "['3']"]
# RESULT: [['1', '2'], ['3']] -> ['1', '2']['3']
# intermediate: ['[1, 2, 3]', '[4, 5, 6]']
# RESULT: [[1, 2, 3], [4, 5, 6]] -> [1, 2, 3][4, 5, 6]
# intermediate: ["['1', 2, 3]", "[4, '5', 6]"]
# RESULT: [['1', 2, 3], [4, '5', 6]] -> ['1', 2, 3][4, '5', 6]
# intermediate: ["[['fir'], ['st']]", "[['se', 'co'], ['nd']]"]
# RESULT: [[['fir'], ['st']], [['se', 'co'], ['nd']]] -> [['fir'], ['st']][['se', 'co'], ['nd']]
# intermediate: ['[[11], [21, 22]]', '[[31, 32], [41, 42]]']
# RESULT: [[[11], [21, 22]], [[31, 32], [41, 42]]] -> [[11], [21, 22]][[31, 32], [41, 42]]
This solution works for lists of int values, string values, and mixed data types, but not for a list of lists.
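The bracketed output for nested lists comes from how str behaves on a list. A short sketch to illustrate, using only built-in functions:

```python
nested = [["first"], ["second"]]

# str() on a list returns its printable representation, brackets and quotes included.
print(str(nested[0]))   # ['first']
print(str([1, 2, 3]))   # [1, 2, 3]

# For a scalar, str() returns just the value as text.
print(str(1))    # 1
print(str("1"))  # 1

# "".join() over a string walks its characters and glues them back together,
# so wrapping str() in "".join() does not change the result.
print("".join(str(1)) == str(1))  # True
```

This also means that str(element) alone would give the same result as "".join(str(element)) in solution 2.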
Solution 3 using map with cast
def solution3(values):
    intermediate = ["".join(map(str, element)) for element in values]
    result = "".join(intermediate)
    return intermediate, result

for values in TEST_DATASET:
    run_test(solution3, values)
# ERROR: [1, 2, 3], 'int' object is not iterable
# intermediate: ['1', '2', '3']
# RESULT: ['1', '2', '3'] -> 123
# ERROR: ['1', 2, 3], 'int' object is not iterable
# intermediate: ['first', 'second']
# RESULT: [['first'], ['second']] -> firstsecond
# intermediate: ['12', '3']
# RESULT: [['1', '2'], ['3']] -> 123
# intermediate: ['123', '456']
# RESULT: [[1, 2, 3], [4, 5, 6]] -> 123456
# intermediate: ['123', '456']
# RESULT: [['1', 2, 3], [4, '5', 6]] -> 123456
# intermediate: ["['fir']['st']", "['se', 'co']['nd']"]
# RESULT: [[['fir'], ['st']], [['se', 'co'], ['nd']]] -> ['fir']['st']['se', 'co']['nd']
# intermediate: ['[11][21, 22]', '[31, 32][41, 42]']
# RESULT: [[[11], [21, 22]], [[31, 32], [41, 42]]] -> [11][21, 22][31, 32][41, 42]
This solution handles the lists of lists that solution 2 can't, but it doesn't work for a flat list that contains int values, because map tries to iterate over each int.
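Solutions 2 and 3 fail on opposite inputs, so one possible middle step is to check the type of each element and pick the matching conversion. The following is a sketch under that assumption. The function name join_one_level is hypothetical, and it still handles only one level of nesting.

```python
def join_one_level(values):
    """Join a list whose items are either scalars or flat lists of scalars."""
    parts = []
    for element in values:
        if isinstance(element, list):
            # Inner list: convert each item to str, then join them (as in solution 3).
            parts.append("".join(map(str, element)))
        else:
            # Scalar such as str or int: convert it directly (as in solution 2).
            parts.append(str(element))
    return "".join(parts)

print(join_one_level([1, 2, 3]))                   # 123
print(join_one_level(["1", 2, 3]))                 # 123
print(join_one_level([["1", 2, 3], [4, "5", 6]]))  # 123456
print(join_one_level([1, [2, 3], "4"]))            # 1234
```

A list nested two or more levels deep would still produce bracketed text here, because str is applied to the inner lists.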
Solution 4 flatten the list first (perfect solution)
If a list contains another list, casting the inner list to str results in something like "[1, 2, 3]". To solve this problem, we can flatten the list before processing.
def flat(element) -> list:
    # A list without inner lists is already flat.
    has_list = any(isinstance(x, list) for x in element)
    if not has_list:
        return element
    flatten_list = []
    for x in element:
        if isinstance(x, list):
            # Flatten the inner list recursively and append its items.
            val = flat(x)
            flatten_list.extend(val)
        else:
            flatten_list.append(x)
    return flatten_list

for values in TEST_DATASET:
    print(flat(values))
# [1, 2, 3]
# ['1', '2', '3']
# ['1', 2, 3]
# ['first', 'second']
# ['1', '2', '3']
# [1, 2, 3, 4, 5, 6]
# ['1', 2, 3, 4, '5', 6]
# ['fir', 'st', 'se', 'co', 'nd']
# [11, 21, 22, 31, 32, 41, 42]
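The same flattening can also be written as a generator, which yields items one by one instead of building a new list at every level. This is an alternative sketch, not the function used in the rest of the article, and iter_flat is a hypothetical name.

```python
from typing import Any, Iterator

def iter_flat(element: Any) -> Iterator[Any]:
    """Yield the non-list items of an arbitrarily nested list, depth first."""
    for x in element:
        if isinstance(x, list):
            # Delegate to the recursive call for the inner list.
            yield from iter_flat(x)
        else:
            yield x

print(list(iter_flat([[[11], [21, 22]], [[31, 32], [41, 42]]])))
# [11, 21, 22, 31, 32, 41, 42]
print(list(iter_flat(["1", 2, 3])))
# ['1', 2, 3]
```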
All lists are flattened as expected. With this function, join works for every list in the dataset.
def solution4(values):
    flatten_list = flat(values)
    intermediate = ["".join(str(element)) for element in flatten_list]
    result = "".join(intermediate)
    return intermediate, result

for values in TEST_DATASET:
    run_test(solution4, values)
# intermediate: ['1', '2', '3']
# RESULT: [1, 2, 3] -> 123
# intermediate: ['1', '2', '3']
# RESULT: ['1', '2', '3'] -> 123
# intermediate: ['1', '2', '3']
# RESULT: ['1', 2, 3] -> 123
# intermediate: ['first', 'second']
# RESULT: [['first'], ['second']] -> firstsecond
# intermediate: ['1', '2', '3']
# RESULT: [['1', '2'], ['3']] -> 123
# intermediate: ['1', '2', '3', '4', '5', '6']
# RESULT: [[1, 2, 3], [4, 5, 6]] -> 123456
# intermediate: ['1', '2', '3', '4', '5', '6']
# RESULT: [['1', 2, 3], [4, '5', 6]] -> 123456
# intermediate: ['fir', 'st', 'se', 'co', 'nd']
# RESULT: [[['fir'], ['st']], [['se', 'co'], ['nd']]] -> firstsecond
# intermediate: ['11', '21', '22', '31', '32', '41', '42']
# RESULT: [[[11], [21, 22]], [[31, 32], [41, 42]]] -> 11212231324142
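All solutions so far join with an empty separator. As a possible variation, the sketch below accepts a separator and walks the nested list with an explicit stack instead of recursion, so very deep nesting does not hit the recursion limit. The name join_nested and its sep parameter are assumptions made for this example.

```python
def join_nested(values, sep=""):
    """Join every non-list item of a nested list into one string."""
    parts = []
    # Stack of iterators: the top one is the list currently being walked.
    stack = [iter(values)]
    while stack:
        for x in stack[-1]:
            if isinstance(x, list):
                # Descend into the inner list and resume the outer one later.
                stack.append(iter(x))
                break
            parts.append(str(x))
        else:
            # The current list is exhausted, so go back to its parent.
            stack.pop()
    return sep.join(parts)

print(join_nested([[["fir"], ["st"]], [["se", "co"], ["nd"]]]))  # firstsecond
print(join_nested([["1", 2, 3], [4, "5", 6]]))                   # 123456
print(join_nested([[1, 2, 3], [4, 5, 6]], sep="-"))              # 1-2-3-4-5-6
```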
Overview
- List of strings
- List containing lists of strings
"".join(["".join(element) for element in values])
# Available for the following lists
# ['1', '2', '3'] -> 123
# [['first'], ['second']] -> firstsecond
# [['1', '2'], ['3']] -> 123
- List containing non-string data types
- Does not work for a list containing lists
"".join(["".join(str(element)) for element in values])
# Available for the following lists
# [1, 2, 3] -> 123
# ['1', '2', '3'] -> 123
# ['1', 2, 3] -> 123
- List containing lists with non-string data types
- Does not work for a flat list with non-string data types
"".join(["".join(map(str, element)) for element in values])
# ['1', '2', '3'] -> 123
# [['first'], ['second']] -> firstsecond
# [['1', '2'], ['3']] -> 123
# [[1, 2, 3], [4, 5, 6]] -> 123456
# [['1', 2, 3], [4, '5', 6]] -> 123456
- Can be used for all list types
"".join(["".join(str(element)) for element in flat(values)])
# [1, 2, 3] -> 123
# ['1', '2', '3'] -> 123
# ['1', 2, 3] -> 123
# [['first'], ['second']] -> firstsecond
# [['1', '2'], ['3']] -> 123
# [[1, 2, 3], [4, 5, 6]] -> 123456
# [['1', 2, 3], [4, '5', 6]] -> 123456
# [[['fir'], ['st']], [['se', 'co'], ['nd']]] -> firstsecond
# [[[11], [21, 22]], [[31, 32], [41, 42]]] -> 11212231324142