The most commonly used convention is to define the natural numbers first and then say that a list is a function whose domain is an initial (finite) segment of the naturals. Then,
$$ [a,b,c] = \{(0,a), (1,b), (2,c)\} $$
(This is convenient in set theory, e.g., because it lets you prove without Replacement that if $A$ is a set, then there is a set of all lists of elements from $A$).
If this is not to your liking you might also take a page from Lisp and represent the empty list as $\varnothing$, and a non-empty list as the ordered pair of the first element and the representation of the rest of the list:
$$ [a,b,c] = (a,(b,(c,\varnothing))) $$
(This works because $\varnothing$ cannot be a Kuratowski pair).
Well, you could define $(a,b,c)$ as $((a,b),(c,c))$, and then it is an element of $\mathscr{P}^4(X)$. More generally, if $n\leq 2^m$, you can represent $n$-tuples as a tree of $m$-tuply nested ordered pairs, and thus as elements of $\mathscr{P}^{2m}(X)$. This shows that $$k^n\leq f^{2\lceil \log_2 n\rceil}(k)$$ where $f(x)=2^x$. (Of course, this bound can also be obtained by just iterating $k^2\leq 2^{2^k}$ to get $k^4=(k^2)^2\leq (2^{2^k})^2\leq 2^{2^{2^{2^k}}}$ and so on.)
You can do better at least in some cases, though. For instance, you can represent triples in $\mathscr{P}^3(X)$ by defining $(a,b,c)$ as $\{(a,b),(a,c),(b,c)\}$. More generally, you can represent an $n$-tuple as the set of all the $(n-1)$-tuples obtained by deleting one entry in the $n$-tuple (for any $n>2$), so inductively this represents $n$-tuples as elements of $\mathscr{P}^n(X)$. This gives the bound $$k^n\leq f^n(k),$$ which is better than the previous bound for $n=3$ and $n=5$. Or, you can use the earlier representation of $2^m$-tuples as elements of $\mathscr{P}^{2m}(X)$ to represent $(2^m+1)$-tuples as elements of $\mathscr{P}^{2m+1}(X)$, which is slightly better than the $\mathscr{P}^{2m+2}(X)$ that would be given by the first method.
To prove that this works, let $s$ be an $n$-tuple and let $R$ be the set of $(n-1)$-tuples obtained by removing an entry from $s$; we will recover $s$ from $R$. If all elements of $R$ have the same first entry (say, $a$), then $a$ must be the first entry of $s$. Moreover, there is then a unique element of $R$ which starts with fewer $a$s than every other element of $R$ (namely the $(n-1)$-tuple obtained by removing the first entry of $s$), and that element is the remaining $n-1$ entries of $s$.
So, we may assume that not all elements of $R$ have the same first entry. If there is some $a$ such that two different elements of $R$ start with $a$, then $a$ must be the first entry of $s$ and the remaining entries are given by the unique element of $R$ that doesn't start with $a$. Thus, we may assume that $R$ has only two different elements, say one starting with $a$ and another starting with $b$. But this means the entries of $s$ can only be $a$ and $b$ (if $s$ had three distinct entries, they would give three distinct elements of $R$). Moreover, all the $a$s must be consecutive, since removing $a$s in different consecutive blocks would give different elements of $R$, and similarly the $b$s must be consecutive. We can count how many $a$s there are in $s$ (the maximum that there are in any element of $R$) and similarly for the $b$s, and we can tell whether the $a$s come first or the $b$s come first since $n>2$. Thus, we can recover $s$ from the set $R$.
For $n=3$, at least, this is optimal in the following sense: it is impossible to represent ordered triples as elements of $\mathscr{P}(\mathscr{P}(X))$ (by a formula that sends a triple $(a,b,c)$ to some doubly nested set expression in $a,b,$ and $c$). Clearly such a representation of a triple $(a,b,c)$ would have to involve all three of $a,b,$ and $c$. But now consider the 6 triples $(a,a,b),(b,b,a),(a,b,a),(b,a,b),(b,a,a),(a,b,b)$. Each of these must be represented by a distinct element of $\mathscr{P}(\mathscr{P}(\{a,b\}))$ which is not fixed if you swap $a$ and $b$. Thus, each one must contain exactly one of $\{a\}$ and $\{b\}$. This means that each one also must contain $\{a,b\}$, since the formula for $(a,b,c)$ must involve all three of $a,b,$ and $c$. But now we have a problem: there are only 4 different subsets of $\mathscr{P}(\{a,b\})$ satisfying this constraint, so our 6 triples cannot all be distinct.
Best Answer
As others have mentioned, it's typical to define $$(a_1,\ldots, a_n)=((a_1,a_2,\ldots, a_{n-1}),a_n)$$ though I like to call them "ordered $n$-tuplets" to distinguish from the alternative definition I give below (whose naming convention is rather standard when used for infinite sets, in which the above definition cannot be extended).
I think it's worth noting that there are a lot of different ways to extend the notion of an ordered-pair. For example, once ordered-pairs and ordered-triplets have been defined, you can define $A\times B$ for two sets $A$ and $B$ and a function as an ordered-triple $(A,B,f)$, where $f\subset A\times B$ is well-defined and left-total. For there, you can then "redefine" $n$-tuples to be surjective maps $f: \{0,\ldots, n-1\}\rightarrow A$. (Strictly speaking, we could get away with only having ordered-pairs if we want to just define an $n$-tuple to be a subset of $\{0,\ldots, n-1\}\times A$ for some set $A$ that is well-defined and left-total. But defining functions as triples is useful because it allows one to talk about surjectivity.)
Another approach is defining $(a,b):=\{a,\{a,b\}\}$, though that this satisfies the property that $(a,b)=(c,d)$ implies $a=c,b=d$ requires the Axiom of Foundation.
For the sake of differentiating between the ordered $n$-tuplets as defined above and these $n$-tuplets, I'll denote an $n$-tuple as $\langle a_0,\ldots, a_{n-1}\rangle$ where $a_0,\ldots, a_{n-1}$ are the images of the function $f$ on $0,\ldots, n-1$, respectively. The nice property that $n$-tuples have that $n$-tuplets don't have is that given an $n$-tuple $\mathbf{a}=\langle a_0,\ldots, a_{n-1}\rangle$ and an $m$-tuple $\mathbf{b}=\langle b_0,\ldots, b_{m-1}\rangle$, we have $\mathbf{a}=\mathbf{b}$ if and only if $n=m$ and $a_i=b_i$ for each $i\in\{0,\ldots, n-1\}=\{0,\ldots, m-1\}$, the reason being that two functions being equal implies their domains are equal.
As mentioned above, when we want to extend the notion of an $n$-tuple(t) to deal with indexing things by arbitrary sets, we have to use the notion of an $n$-tuple, defining an $I$-tuple (or a family of sets indexed by $I$) to be a surjective map $f:I\rightarrow A$. If want to regard these $I$-tuples as simply being elements of some infinite Cartesian product $\prod_{i\in I}{A_i}$ (where $(A_i)_{i\in I}$ is a family of sets indexed by $I$), we could choose the elements to either be indexed families $f:I\rightarrow A$ where $A\subset \bigcup_{i\in I}{A_i}$ and $f(i)\in A_i$ for each $i\in I$ (so $f$ is necessarily surjective by definition), or we could choose to take the elements to be simply functions $f:I\rightarrow \bigcup_{i\in I}{A_i}$ where $f(i)\in A_i$ for each $i\in I$. Restricting to the finite case where $I=\{0,\ldots, n-1\}$, we then have a few different ways we could have defined, say, $A\times B$ (though we did use the Kuratowski definition to define a function in the first place, nothing stops us from then considering as "ordered pairs" $2$-tuples rather than $2$-tuplets from now on).
Given these alternatives exist for definitions of $n$-tuple(t)s or infinite cartesian products, a question might be which is better. Ultimately, it's kind of an arbitrary choice, in the sense that there is a natural way to switch between the two that "preserves" how the canonical projections act on the Cartesian products.