Band structure of Bi2Se3 nanosheets
Our starting point is the \(\varvec{k}\varvec{\cdot }\varvec{p}\) Hamiltonian derived previously61,62 for modeling three-dimensional Bi2Se3 and Bi2Te3 around the \(\Gamma\) point in the Brillouin zone. To account for the quantization in the z-direction in a nanosheet geometry of thickness \(L_{z}\), we solve the model at the two-dimensional \(\Gamma\) point \({(k_{x} = k_{y} = 0)}\) with the substitution \({k_{z} \rightarrow -i \partial _{z}}\) and hard-wall boundary conditions as done before63,64. Everywhere in our numerics we consider \({L_{z} = {6}\,{\hbox {nm}}}\), which in the case of Bi2Se3 is sufficiently large for the single-particle Dirac states on the opposite surfaces not to be gapped out by tunneling processes and the nontrivial topology of the bulk to survive65,66,67, but also sufficiently small for bulk electrons and holes to still behave as in two dimensions. We then project the 3D Hamiltonian onto the energetically highest-lying valence and lowest-lying conduction subbands in order to integrate out the z-dependence. Hence, we assume that the relevant low-energy physics of individual particles is confined to this subspace, which has been verified a posteriori by comparing the obtained exciton binding energies to the energy splitting of the bulk subbands due to the confinement in the z-direction.
Ultimately, after projecting the three-dimensional \(\varvec{k}\varvec{\cdot }\varvec{p}\) Hamiltonian onto the single-particle ground states of the 3D model, we obtain our desired nanosheet Hamiltonian. Written in terms of Pauli matrices in spin and orbital space, \(\varvec{s}\) and \(\varvec{\tau }\) respectively, it is given by the \(4\times 4\) matrix
$$\begin{aligned} H_0(\varvec{k}) = \epsilon _0(\varvec{k}) + M(\varvec{k})\tau _z + A_2(k_x s_x + k_y s_y) \tau _x ~ , \end{aligned}$$
(7)
where \(\varvec{k} \equiv (k_{x}, k_{y})\) is the in-plane momentum, \(\epsilon _0(\varvec{k}) \equiv E + D_2 (k_x^2 + k_y^2)\), and \(M(\varvec{k}) \equiv M – B_2 (k_x^2 + k_y^2)\). Note that products of matrices in different spaces, e.g., \(s_x \tau _x\), are Kronecker products and not matrix products, and that identity matrices are implied. Our effective two-dimensional Hamiltonian is equivalent to that of the BHZ model after a suitable unitary transformation. Furthermore, as expressed in Eq. (7) it has the same form as the three-dimensional Bi2Se3 Hamiltonian with \(k_{z} = 0\) and some renormalized values of the parameters with respect to those given in Ref.61 However, the basis in which it is expressed differs from that of the 3D model, since the original \(\textrm{Bi}^{+}\) and \(\textrm{Se}^{-}\) orbitals have hybridized in the corresponding eigenstates. We denote these hybridized orbitals by \(\mathrm {Bi’}^{+}\) and \(\mathrm {Se’}^{-}\). The renormalized values of the parameters, with which all of our numerical results have been obtained, are \(A_2 = {0.41}\,{\hbox {eV nm}}\), \(M = {0.28}\,{\hbox {eV}}\), \(B_2 = {0.473}\,\hbox {eV}\,{\hbox {nm}^2}\), \(E = {-\,0.0012}\,{\hbox {eV}}\), and \(D_2 = {0.202}\,\hbox {eV}\,{\hbox {nm}^2}\).
Diagonalization of \(H_{0}(\varvec{k})\) leads to the conduction and valence band energies
$$\begin{aligned} \epsilon _{c,v}(\varvec{k}) = \epsilon _0(\varvec{k}) \pm \sqrt{M(\varvec{k})^2 + A_2^2(k_x^2+k_y^2)} ~, \end{aligned}$$
(8)
as well as to the topologically nontrivial single-particle states that we use throughout in our description of excitons. As a consequence of time-reversal symmetry in combination with inversion symmetry, the Hamiltonian does not couple the two subspaces \(\{ | \mathrm {Bi’}^{+};\uparrow \rangle ,| \mathrm {Se’}^{-};\downarrow \rangle \}\) and \(\{ | \mathrm {Bi’}^{+};\downarrow \rangle ,| \mathrm {Se’}^{-};\uparrow \rangle \}\), and the conduction and valence bands are each two-fold degenerate. The corresponding eigenstates are labeled by their so-called spin-orbit parity, defined as the eigenvalue of the operator \(s_{z} \tau _{z}\) that commutes with \(H_{0}(\varvec{k})\).
We next introduce the vector of pseudospin operators \(\varvec{\Gamma } \equiv (s_x\tau _x,\,s_y\tau _x,\,\tau _z)\), which are used to rewrite Eq. (7) as \(H_0(\varvec{k}) = \epsilon _0(\varvec{k}) + \varvec{d}(\varvec{k}) \varvec{\cdot } \varvec{\Gamma }\), with \(\varvec{d}(\varvec{k}) \equiv (A_2 k_x, \, A_2 k_y, \, M(\varvec{k}))\). The nontrivial topology of the Hamiltonian is now very explicit in this form, since \(\varvec{d}(\varvec{k})\) is a skyrmion texture in the momentum plane \((k_x, k_y)\) and the operator \(\varvec{\Gamma }\) reduces exactly to the three Pauli matrices in the uncoupled even and odd spin-orbit-parity subspaces. Therefore, the expectation value of the pseudospin as a function of \(\varvec{k}\) follows the winding of \(\varvec{d}(\varvec{k})\) and the winding number of the latter is up to a possible sign equal to the Chern number of the conduction and valence bands. The winding number for the subspace with spin-orbit parity \(s \in \{+,-\}\) is \(\frac{s}{2}({\text {sgn}} M + {\text {sgn}} B_{2})\), which is nontrivial when M and \(B_{2}\) have the same sign.
Exciton basis and wave functions
The conduction and valence states are \(\langle \varvec{x};a | \varvec{k};\mu \rangle = (e^{i \varvec{k}\varvec{\cdot }\varvec{x}}/\sqrt{V}) \langle a | \chi _{\varvec{k}}^{\mu } \rangle\), where \(V = L_x L_y\), the states \(| a \rangle\) denote our four-dimensional combined spin and orbital basis states, and \(| \chi ^{\mu }_{\varvec{k}} \rangle\) with \(\mu \in \{c,v\} \times \{+,-\}\) are the eigenstates of the Hamiltonian in Eq. (7). Note that the latter are not periodic due to the fact that our Hamiltonian is derived from \(\varvec{k} \varvec{\cdot } \varvec{p}\) theory and only accurately models the band structure around the \(\Gamma\) point. An exciton state \(X \equiv \{\varvec{Q},s,t\}\) is labeled by the total momentum \(\varvec{Q}\) and the spin-orbit-parities s and t of the electron and hole bands, respectively, as well as by an additional set of ro-vibrational quantum numbers that we introduce later. Such an exciton state is a bound state in the polarization68, which in second quantization is determined by the pair correlation function
$$\begin{aligned}&\big \langle {\hat{\psi }}_{a}(\varvec{x}_1) {\hat{\psi }}_{b}^\dagger (\varvec{x}_2)\big \rangle _{X} \nonumber \\&\quad = \sum _{\varvec{k}} \Phi ^{s,t}_{\varvec{Q},\varvec{k}} \langle \varvec{x}_1;a | \varvec{Q}/2 + \varvec{k};c,{s} \rangle \langle -\varvec{Q}/2 + \varvec{k}; v,{t} | \varvec{x}_2;b \rangle \nonumber \\&\quad = \frac{e^{i \varvec{Q} \varvec{\cdot } \varvec{R}}}{V} \sum _{\varvec{k}} \Phi ^{s,t}_{\varvec{Q},\varvec{k}} \, e^{i \varvec{k}\varvec{\cdot } (\varvec{x}_1 – \varvec{x}_2)} \langle a | \chi _{\varvec{Q}/2 + \varvec{k}}^{c,{s}} \rangle \langle \chi _{{-}\varvec{Q}/2 + \varvec{k}}^{v,{t}} | b \rangle ~, \end{aligned}$$
(9)
where \(s,t \in \{+,-\}\) and \(\varvec{R} \equiv (\varvec{x}_{1} + \varvec{x}_{2})/2\) is the position of the exciton. Here \(\Phi ^{s,t}_{\varvec{Q},\varvec{k}}\) is the relative wave function of the exciton in momentum space. One may also perform a particle-hole transformation so that holes become positive-energy excitations. In terms of electrons and holes, the conduction states remain unchanged, \(\langle a | \chi _{\varvec{q}}^{e,s} \rangle = \langle a | \chi _{\varvec{q}}^{c,s} \rangle\). By contrast, the hole states satisfy \(\langle b | \chi _{-\varvec{q}}^{h,t} \rangle = \langle \chi _{\varvec{q}}^{v,t} | b \rangle\). In this picture the pair correlation function thus reads
$$\begin{aligned} \big \langle {\hat{\psi }}_{a}&(\varvec{x}_1){\hat{\psi }}_{b}(\varvec{x}_2)\big \rangle _{X} \nonumber \\&\quad = \frac{e^{i \varvec{Q} \varvec{\cdot } \varvec{R}}}{V} \sum _{\varvec{k}} \Phi ^{s,t}_{\varvec{Q},\varvec{k}} \, e^{i \varvec{k}\varvec{\cdot } (\varvec{x}_1 – \varvec{x}_2)} \langle a | \chi _{\varvec{Q}/2 + \varvec{k}}^{e,{s}} \rangle \langle b | \chi _{\varvec{Q}/2 – \varvec{k}}^{h,{t}} \rangle ~. \end{aligned}$$
(10)
If desired we can then also obtain a first-quantized wave function, which is given by
$$\begin{aligned} \Psi ^{X}_{ab}(\varvec{x}_1, \varvec{x}_2) =\,&\frac{1}{V} \frac{e^{i \varvec{Q} \varvec{\cdot } \varvec{R}}}{\sqrt{V}} \sum _{\varvec{k}} e^{i \varvec{k}\varvec{\cdot } (\varvec{x}_1 – \varvec{x}_2)} \nonumber \\&\times \frac{1}{\sqrt{2}} \left( \Phi ^{s,t}_{\varvec{Q},\varvec{k}} {\langle a | \chi _{\varvec{Q}/2 + \varvec{k}}^{e,{s}} \rangle}_{1} {\langle b | \chi _{\varvec{Q}/2-\varvec{k}}^{h,{t}} \rangle }_{2} – \Phi ^{s,t}_{\varvec{Q},{-}\varvec{k}} {\langle b | \chi _{\varvec{Q}/2 + \varvec{k}}^{e,{s}} \rangle }_{2} {\langle a | \chi _{\varvec{Q}/2+\varvec{k}}^{h,{t}} \rangle }_{1} \right) ~. \end{aligned}$$
(11)
This is normalized provided that \(\frac{1}{V} \sum _{\varvec{k}} |\Phi ^{s,t}_{\varvec{Q},\varvec{k}}|^{2} = 1\). Note that interchanging both particles, thus \(\varvec{x}_{1} \leftrightarrow \varvec{x}_{2}\) and \((a,1) \leftrightarrow (b,2)\), results in an overall minus sign after we perform the transformation \(\varvec{k} \rightarrow -\varvec{k}\).
The single-particle states, together with the relative momentum states \({| \varvec{k} \rangle }_{\textrm{rel}}\), describe the exciton states \(| \varvec{Q};s,t \rangle\) fully in Dirac notation as
$$\begin{aligned} | \varvec{Q};s,t \rangle = \frac{1}{\sqrt{V}} | \varvec{Q} \rangle \sum _{\varvec{k}} \Phi ^{s,t}_{\varvec{Q},\varvec{k}} {| \varvec{k} \rangle }_{\textrm{rel}} {| \chi _{\varvec{Q}/2 + \varvec{k}}^{e,{s}} \rangle } | \chi _{\varvec{Q}/2-\varvec{k}}^{h,{t}} \rangle ~, \end{aligned}$$
(12)
where \(\langle \varvec{R} | \varvec{Q} \rangle = e^{i\varvec{Q}\varvec{\cdot }{\varvec{R}}} / \sqrt{V}\). The four combinations of the spin-orbit-parity labels s and t lead to four distinct exciton basis states, similar to the singlet and triplet excitons in regular semiconductors. As explained in the next section, the states \(| \varvec{Q};s,t \rangle\) in Eq. (12) are exact eigenstates of the Wannier problem with only the direct interaction included. This is in fact a much used approximation in the literature43, but in our case not sufficiently accurate as the exchange interaction in principle couples these states for \(\varvec{Q} \ne \varvec{0}\). Nevertheless, the above set of states can still be considered as the most appropriate basis for the full excitonic problem. Further note that, as advanced, \(| \varvec{Q};s,t \rangle\) more precisely stands for an entire family of ro-vibrational states which must be labeled by additional quantum numbers describing the different relative wave functions that solve the exciton problem.
The topology of these exciton basis states can now be intuitively understood, as in the single-electron case, from the dependence of the expectation value of the pseudospin \(\varvec{\Gamma }_{e}(\varvec{Q})\) on the momentum \(\varvec{Q}\), obtained from Eq. (12) as
$$\begin{aligned} \varvec{\Gamma }_{e}(\varvec{Q}) = \frac{1}{V} \sum _{\varvec{k}} |\Phi ^{s,t}_{\varvec{Q},\varvec{k}}|^2 \langle \chi _{\varvec{Q}/2+\varvec{k}}^{e,s} | \varvec{\Gamma } | \chi _{\varvec{Q}/2+\varvec{k}}^{e,s} \rangle ~, \end{aligned}$$
(13)
and similarly for the pseudospin of the hole.
To determine the topology of the states \(| \varvec{Q};s,t \rangle\) mathematically more rigorously, we need to compute the winding number of the pseudospin. Note that for the moment we particularize to the case of a globally vanishing exchange interaction, in which case all states \(| \varvec{Q};s,t \rangle\) are uncoupled and therefore have a well-defined Chern number. To access the topological properties we must compute the Berry connection8,9,10,11,12, which in the excitonic case is given by a sum of three distinct terms, \(\varvec{A}_{s,t}(\varvec{Q}) = \varvec{A}^{(e)}_{s,t}(\varvec{Q}) + \varvec{A}^{(h)}_{s,t}(\varvec{Q}) + \varvec{A}^{(\Phi )}_{s,t}(\varvec{Q})\). These read
$$\begin{aligned} \varvec{A}^{(e)}_{s,t}(\varvec{Q})&= -\frac{i}{V} \sum _{\varvec{k}} |\Phi ^{s,t}_{\varvec{Q},\varvec{k}}|^2 \langle \chi _{\varvec{Q}/2+\varvec{k}}^{e,s} | \varvec{\nabla }_{\varvec{Q}} | \chi _{\varvec{Q}/2+\varvec{k}}^{e,s} \rangle ~, \end{aligned}$$
(14)
$$\begin{aligned} \varvec{A}^{(h)}_{s,t}(\varvec{Q})&= -\frac{i}{V} \sum _{\varvec{k}} |\Phi ^{s,t}_{\varvec{Q},\varvec{k}}|^2 \langle \chi _{\varvec{Q}/2-\varvec{k}}^{h,t} | \varvec{\nabla }_{\varvec{Q}} | \chi _{\varvec{Q}/2-\varvec{k}}^{h,t} \rangle ~, \end{aligned}$$
(15)
$$\begin{aligned} \varvec{A}^{(\Phi )}_{s,t}(\varvec{Q})&= -\frac{i}{V} \sum _{\varvec{k}} (\Phi ^{s,t}_{\varvec{Q},\varvec{k}})^{*} \, \varvec{\nabla }_{\varvec{Q}} \Phi ^{s,t}_{\varvec{Q},\varvec{k}} ~. \end{aligned}$$
(16)
Each of these terms contributes to the local Berry curvature or momentum-space magnetic field through \(\varvec{B}^{(i)}_{s,t}(\varvec{Q}) = \varvec{\nabla }_{\varvec{Q}} \times \varvec{A}^{(i)}_{s,t}(\varvec{Q})\), where \(i \in \{e, h, \Phi\}\). Finally, the Berry curvature is connected to the desired winding or Chern number by
$$\begin{aligned} \mathcal {C} = \frac{1}{2\pi } \int \textrm{d}^{2} \varvec{Q} \varvec{\cdot } \varvec{B}(\varvec{Q}) = \frac{1}{2 \pi } \oint \textrm{d} \varvec{Q} \varvec{\cdot } \varvec{A}(\varvec{Q}) ~, \end{aligned}$$
(17)
where the first surface integral is performed over the first Brillouin zone, which in our continuum model amounts to integration over the infinite momentum space, and the equivalent line integral is performed along a contour with \(Q \rightarrow \infty\). It is clear that the terms \(\varvec{B}^{(e)}_{s,t}(\varvec{Q})\) and \(\varvec{B}^{(h)}_{s,t}(\varvec{Q})\) not only contain the direct weighted Berry curvature of the underlying single particles, but also an interference term between the single-particle Berry connection and the excitonic envelope wave function. However, the total contribution of these terms to the exciton Chern number is given only by the single-particle electron or hole Chern number. This can be shown, e.g., for the electron case, by first expanding \(\langle \chi ^{e,s}_{\varvec{Q}/2 + \varvec{k}} | \varvec{\nabla }_{\varvec{Q}} | \chi ^{e,s}_{\varvec{Q}/2 + \varvec{k}} \rangle\) with respect to \(\varvec{k}\) and noticing that \(\varvec{\nabla }_{\varvec{k}} | \chi ^{e,s}_{\varvec{Q}/2 + \varvec{k}} \rangle \big |_{\varvec{k} = \varvec{0}} = 2 \varvec{\nabla }_{\varvec{Q}} | \chi ^{e,s}_{\varvec{Q}/2} \rangle\). In our model one then observes that all terms in the line integral decay faster with Q than the zeroth-order term \(\langle \chi ^{e,s}_{\varvec{Q}/2} | \varvec{\nabla }_{\varvec{Q}} | \chi ^{e,s}_{\varvec{Q}/2} \rangle\) and thus vanish on the contour at infinity. Therefore,
$$\begin{aligned} \mathcal {C}_{e} = {-} \frac{i}{2 \pi } \oint \textrm{d} \varvec{Q} \varvec{\cdot } \bigg (\frac{1}{V} \sum _{\varvec{k}} |\Phi ^{s,t}_{\varvec{Q,\varvec{k}}}|^{2} \langle \chi ^{e,s}_{\varvec{Q}/2} | \varvec{\nabla }_{\varvec{Q}} | \chi ^{e,s}_{\varvec{Q}/2} \rangle \bigg ) ~, \end{aligned}$$
(18)
and the normalization condition \(\frac{1}{V} \sum _{\varvec{k}} |\Phi ^{s,t}_{\varvec{Q,\varvec{k}}}|^{2} = 1\) may now be used. We are then left precisely with the expression for the free electron Chern number.
There can in principle be an additional contribution to the Chern number due to the envelope wave function itself, given by \(\mathcal {C}_{\Phi } = \frac{1}{2 \pi } \oint \textrm{d} \varvec{Q} \varvec{\cdot } \varvec{A}^{(\Phi )}_{s,t}(\varvec{Q})\). However, this can be shown to vanish here by analyzing the winding of the direct interaction around the origin of \(\varvec{Q}\). This winding is trivial, meaning that the potential does not pick up a phase as one circles around this point. That is, \(V^{\textrm{D}}(\varvec{Q}; k, \phi _{\varvec{k}}, k’, \phi _{\varvec{k}’}) = {V^{\textrm{D}}(Q \hat{\varvec{x}}; k, \phi _{\varvec{k}} – \phi _{\varvec{Q}}, k’, \phi _{\varvec{k}’} – \phi _{\varvec{Q}})}\), where \(\phi _{\varvec{q}}\) is the angle between the momentum \(\varvec{q}\) and the x-axis. A straightforward analysis of the exciton eigenproblem then shows that we can choose a gauge such that \(\Phi ^{s,t}_{\varvec{Q}}(k, \phi _{\varvec{k}}) = \Phi ^{s,t}_{Q \hat{\varvec{x}}}(k, \phi _{\varvec{k}} – \phi _{\varvec{Q}})\). Observing that the system enjoys reflection symmetry with respect to the x-axis when \(\varvec{Q} = Q \hat{\varvec{x}}\), this may now be used to verify that \(\mathcal {C}_{\Phi } = 0\).
We thus conclude that all global topological properties of the excitons are introduced by the electron and hole states \(| \chi _{\varvec{Q}/2 + \varvec{k}}^{e,{s}} \rangle\) and \(| \chi _{\varvec{Q}/2-\varvec{k}}^{h,{t}} \rangle\). Hence, the total Chern number of the above exciton basis states is \(\mathcal {C} = \mathcal {C}_{e} + \mathcal {C}_{h}\), which for our BHZ model becomes \(\mathcal {C} = s + t\) by explicit calculation. Physically, this result can be most easily understood by the fact that the even and odd spin-orbit-parity subspaces are related by time-reversal symmetry and that the wave function for the hole is the complex conjugate of the electronic valence-band wave function, as we have seen. Note, however, that the local properties that influence for instance the electron and hole transport are quantified by the Berry curvature, and are thus still dependent on the precise shape of the relative wave functions and the interference terms.
We stress that the intuitive picture given above is only valid in the case of zero exchange interaction. As seen in the following section, when this is included the two subspaces with \(s = t\) become coupled together and cannot be treated individually. The Chern number is then technically not well-defined since the time-reversal symmetry protects the degeneracy at \(\varvec{Q} = \varvec{0}\). However, the nontrivial winding caused by the underlying Chern numbers remains, and as a result the full excitonic eigenstates still possess a chirality of \(\pm 2\). This is crucially dependent on the fact that the exchange interaction that couples the two distinct subspaces winds nontrivially around the origin of \(\varvec{Q}\), transforming instead as \({V^{\textrm{X}}(\varvec{Q}; k, \phi _{\varvec{k}}, k’, \phi _{\varvec{k}’}) = e^{{-}2 i \phi _{\varvec{Q}}} V^{\textrm{X}}(Q \hat{\varvec{x}}; k, \phi _{\varvec{k}} – \phi _{\varvec{Q}}, k’, \phi _{\varvec{k}’} – \phi _{\varvec{Q}})}\).
Electron-hole interaction and Bethe–Salpeter equation
Having introduced the free part of the full Hamiltonian, we now proceed to discuss the electron-hole interaction potential that binds these particles together to form an exciton state. The interaction potential is \(V_{s,t;s’,t’}(\varvec{Q};\varvec{k},\varvec{k}’) = V^{\textrm{D}}_{s,t;s’,t’}(\varvec{Q};\varvec{k},\varvec{k}’) – V^{\textrm{X}}_{s,t;s’,t’}(\varvec{Q};\varvec{k},\varvec{k}’)\), where \(V^{{\text{D}}}\) and \(V^{{\text{X}}}\) denote the direct and exchange interactions, respectively. These are given by
$$\begin{aligned} V^{\textrm{D}}_{s,t;s’,t’}&(\varvec{Q};\varvec{k},\varvec{k}’) =\delta _{s,s’} \delta _{t,t’} V(\varvec{k} – \varvec{k’}) \langle \chi _{\varvec{Q}/2+\varvec{k}}^{c,{s}} | \chi _{\varvec{Q}/2+\varvec{k}’}^{c,{s}} \rangle \langle \chi _{-\varvec{Q}/2+\varvec{k}’}^{v,{t}} | \chi _{-\varvec{Q}/2+\varvec{k}}^{v,{t}} \rangle ~, \end{aligned}$$
(19)
$$\begin{aligned} V^{\textrm{X}}_{s,t;s’,t’}&(\varvec{Q};\varvec{k},\varvec{k}’)=\delta _{s,t} \delta _{s’,t’} V(\varvec{Q}) \langle \chi _{\varvec{Q}/2+\varvec{k}}^{c,{s}} | \chi _{-\varvec{Q}/2+\varvec{k}}^{v,s} \rangle \langle \chi _{-\varvec{Q}/2+\varvec{k}’}^{v,{s’}} | \chi _{\varvec{Q}/2+\varvec{k}’}^{c,s’} \rangle ~, \end{aligned}$$
(20)
where \(V(\varvec{q})\) is the bare electrostatic potential within the nanosheet. A variational approach with the trial wave functions in the previous section leads to the Bethe-Salpeter equation69,70
$$\begin{aligned} \sum _{{\varvec{k}}’, s’, t’} \langle {\varvec{Q}}, {\varvec{k}}; s, t | H | {\varvec{Q}}, {\varvec{k}}’; s’, t’\rangle \Phi _{{\varvec{Q}}, {\varvec{k}}’}^{s’,t’} = {\varepsilon _{{\varvec{Q}}}} \Phi _{{\varvec{Q}}, {\varvec{k}}}^{s,t} ~. \end{aligned}$$
(21)
The matrix elements of the total Hamiltonian, including the electron-hole interaction, read
$$\begin{aligned} \langle {\varvec{Q}}, {\varvec{k}}; s, t | H | {\varvec{Q}}, {\varvec{k}}’; s’, t’\rangle&= \big [\epsilon _{c}({\varvec{Q}}/2 + {\varvec{k}}) – \epsilon _{v}({\varvec{Q}}/2 – {\varvec{k}})\big ] \delta _{{\varvec{k}},{\varvec{k}}’} \delta _{s,s’} \delta _{t,t’} \nonumber \\&\quad + \frac{1}{V} \big [ V^{\textrm{D}} _{s,t;s’,t’}({\varvec{Q}}; {\varvec{k}}, {\varvec{k}}’) – V^{\textrm{X}}_{s,t;s’,t’}({\varvec{Q}}; {\varvec{k}}, {\varvec{k}}’) \big ] ~. \end{aligned}$$
(22)
We have solved this equation with both the bare two-dimensional Coulomb potential \(V^{\textrm{C}}(\varvec{q})\) with the dielectric constant of the surrounding medium \(\varepsilon _{s}\), and with the Rytova–Keldysh potential \(V^{\textrm{RK}}(\varvec{q})\), which also takes into account the dielectric constant \(\varepsilon _{d}\) of Bi2Se3 and is typically used in a nanosheet geometry. The explicit forms of these potentials in momentum space read
$$\begin{aligned} V^{\textrm{C}}(\varvec{q})&= {-}\frac{e^{2}}{2 \varepsilon _{0} \varepsilon _{s} q} ~, \end{aligned}$$
(23)
$$\begin{aligned} V^{\textrm{RK}}(\varvec{q})&= {-}\frac{e^{2}}{2 \varepsilon _{0} \varepsilon _{s}} \frac{1}{q(1 + r_{0}q)} ~, \end{aligned}$$
(24)
where \(r_{0} = (\varepsilon _{d}/2 \varepsilon _{s}) L_{z}\) is a screening length that depends on the dielectric constants of Bi2Se3 and the surrounding environment. We have chosen the relative permittivity \({\varepsilon _{s} = 6}\), which is a typical low value for the environment71. The bulk dielectric constant has been set to \({\varepsilon _{d} = 28}\) in accord with recent first-principles studies of rhombohedral Bi2Se3 in the near-infrared region at 6 QLs72,73.
Our approach implies that we neglect the effects of quantum confinement on the classical electrostatic interaction. This is acceptable if the in-plane separation of the bound electron and hole is larger than the nanosheet thickness, as in this case the electric field lines will mostly lie in the surrounding environment. We find that the long-wavelength Coulomb interaction alone cannot accurately model excitons in bismuth selenide nanosheets because the resulting exciton diameters of the low-lying states are in fact smaller than the slab thickness, as seen in Table 1. Coincidentally, the Rytova–Keldysh interaction closely resembles the quantum-confined Keldysh potential at all momenta74, which is in any case guaranteed to accurately model the electrostatic interaction in this geometry. For this reason, it is not important which of these potentials enters Eqs. (19) and (20), and we have additionally checked that the difference in binding energies obtained with the Rytova–Keldysh and the quantum-confined Keldysh potentials are no larger than \({1}\,{\hbox {meV}}\), leading to errors lower than \(10\%\). Note, however, that fully incorporating the quantum confinement would in principle lead to electrostatic interactions between subspaces with different spin-orbit parity due to the small overlap of the wave functions in the z-direction. This may have a small effect on the Rytova–Keldysh ground-state excitons, which are slightly smaller than the nanosheet thickness, but we note that the dielectric constant we have employed for the surrounding environment is relatively low and using a higher value would also lead to bigger excitons and thus mitigate this issue.
Additionally, we must compare the binding energies of the excitons with the splitting of the first two bulk subbands due to confinement, as we have neglected the subspaces of higher energy. Our approximation of projecting the 3D Hamiltonian onto the bulk subbands closest to the Fermi surface will be justified if the former is smaller than the latter. A binding energy larger than the bulk-subband splitting would imply the need to include the wave functions of higher excited states and thus effectively hinder the two-dimensional treatment of the problem. The splittings of the conduction and valence subbands for \(L_{z} = {6}\,{\hbox {nm}}\) are found to be about \({61}\,{\hbox {meV}}\) and \({24}\,{\hbox {meV}}\), respectively, and the binding energies are given in Fig. 7. Except for the ground state, all states have a binding energy that is smaller than the relevant subband splittings. In the case of the ground state, the binding energy is only around \({1}\,{\hbox {meV}}\) larger than the valence-band splitting, so we do not expect that including the second subband will lead to significant modifications. We conclude that the Rytova–Keldysh potential is apt for our study of excitons in Bi2Se3 nanosheets, especially if we use a higher relative permittivity for the environment. Note, however, that the topology is neither affected by the values of the dielectric constants, nor by the precise form of the interaction potential.
Due to the orthogonality of the states in the different spin-orbit-parity subspaces, the exchange interaction \(V^{\textrm{X}}\) only contributes when the spin-orbit parities of the electron and the hole are equal in both the initial and final states and when \(\varvec{Q}\ne \varvec{0}\), as the inner products in Eq. (20) tend linearly to zero as a function of \(\varvec{Q}\). Thus, specifically, the states \(| \varvec{Q};+,- \rangle\) and \(| \varvec{Q};-,+ \rangle\) are not coupled by the exchange interaction, whereas the states \(| \varvec{Q};+,+ \rangle\) and \(| \varvec{Q};-,- \rangle\) are. We can therefore solve the problem in the subspaces spanned by \(| \varvec{Q};+,+ \rangle\) and \(| \varvec{Q};-,- \rangle\) on the one hand, and by \(| \varvec{Q};\pm ,\mp \rangle\) on the other hand. Since the interaction potentials are, up to complex conjugation, the same for the states \(| \varvec{Q};+,- \rangle\) and \(| \varvec{Q};-,+ \rangle\), the corresponding wave functions also only differ by complex conjugation, and both states are degenerate in energy for all \(\varvec{Q}\). By contrast, the eigenstates spanned by \(| \varvec{Q};+,+ \rangle\) and \(| \varvec{Q};-,- \rangle\) are degenerate in energy only for \(\varvec{Q} = \varvec{0}\).
Derivation of optical properties
The oscillator strength of an exciton \(| \varvec{0}; s, t; n, m \rangle\) reads75,76
$$\begin{aligned} f^{m, \hat{\varvec{e}}}_{s,t;n} = \frac{2}{\varepsilon ^{m}_{s,t;n}} \bigg |\int \textrm{d}^{2} k \, \Phi ^{(m)}_{\varvec{0}; s,t;n} \, e^{i m \phi _{\varvec{k}}} \, \hat{\varvec{e}} \varvec{\cdot } \langle \chi ^{v,t}_{\varvec{k}} | \varvec{v}(\varvec{k}) | \chi ^{c,s}_{\varvec{k}} \rangle \bigg |^{2} , \end{aligned}$$
(25)
where \(\hat{\varvec{e}}\) is the Jones vector of the outgoing beam, \(\varepsilon ^{m}_{s,t;n}\) is the energy of the zero-momentum exciton, and \(\varvec{v}(\varvec{k})\) is the velocity operator. Since only interatomic transitions are expected to play a role in Bi2Se3, the velocity operator is well approximated by \(\varvec{v}(\varvec{k}) \approx \varvec{\nabla }_{\!\varvec{k}} H_{0}(\varvec{k})\)77. The Jones vectors of left- and right-circularly polarized light are \(\hat{\varvec{e}}_{+}\) and \(\hat{\varvec{e}}_{-}\), respectively, with \(\hat{\varvec{e}}_{\pm } = \frac{1}{\sqrt{2}}(\hat{\varvec{x}} \pm i \hat{\varvec{y}})\). Computing the matrix elements that enter the above equation readily yields the reported results.
Effective exciton Hamiltonian
The Hamiltonian of Eq. (4) can be obtained by computing the relevant matrix elements with the relative wave functions for \(\varvec{Q} \rightarrow \varvec{0}\). The phase \(\phi _{\varvec{Q}}\) must be handled with care in this limit by writing \(\Phi _{\varvec{Q},\varvec{k}}^{s,t} \approx \Phi _{\varvec{0},|\varvec{k}|} \, e^{i m (\phi _{\varvec{k}} – \phi _{\varvec{Q}})}\) for excitons with angular momentum m. Another key point is that the symmetries of the potential matrix elements imply
$$\begin{aligned} \langle \varvec{Q}; +,+; n,m | {\hat{V}}^{X}| \varvec{Q};-,-;n,-m\rangle = e^{-2 i \phi _{\varvec{Q}}} \langle \varvec{Q}; +,+; n,m | {\hat{V}}^{X}| \varvec{Q};+,+;n,m\rangle ~, \end{aligned}$$
(26)
which results in the aforementioned coupling, This perturbation scheme is accurate to order \(\varvec{Q}^{2}\), which nevertheless is not enough to break the degeneracy of the odd-m states away from nonzero \(\varvec{Q}\). To capture this effect it is required to include corrections to the zero-momentum wave functions as well.
Computational details
The Bethe-Salpeter equation has been solved with an independently developed MATLAB code that implements the discretization procedure of Ref.78 after rewriting all equations in terms of the dimensionless momentum \(u = k L_{z}\) and keeping only contributions from angular momenta \(|m| \le 3\). The integrals in the matrix elements have been performed up to a momentum cutoff \(U = 10\) and with a discretization \(\Delta u = 0.05\), and we have verified that the results do not depend on the cutoff by performing additional calculations up to \(U = 40\). The long-wavelength divergence of the Coulomb potential has been regularized via an infrared cutoff \(\Delta _{V} u\) chosen such that \(\Delta _{V} u / \Delta u \approx 0.2262\), for which we have checked that the energies do not depend on \(\Delta u\). If \(\Delta _{V} u / \Delta u\) is chosen differently, a well-defined extrapolation of the exciton energy levels for \(\Delta u \rightarrow 0\) can be done by using different values of the discretization.
In order to identify the angular momentum quantum numbers unambiguously after solving the Bethe-Salpeter equation, we choose a nonsingular gauge for the single-particle eigenstates that enter the direct and exchange potentials39,75. We have checked that with this choice the level ordering in the trivial regime reduces to that of the 2D hydrogen atom.