Citation: Ma TG, Wang HZ, Guo LJ. OptoGPT: A foundation model for inverse design in optical multilayer thin film structures. Opto-Electron Adv 7, 240062 (2024). doi: 10.29026/oea.2024.240062
Supplementary information for OptoGPT: A foundation model for inverse design in optical multilayer thin film structures |
Schematic of using the Opto-Generative Pretrained Transformer (OptoGPT) to design multilayer thin film structures. (a) and (b) show diagrams of a general GPT model in NLP and of our OptoGPT, respectively. In GPT, the model takes in the input prompts and generates answers by probability sampling in an auto-regressive way. In OptoGPT, the input prompts are the optical targets, while the outputs are the designed multilayer structures. (c) Different types of inputs that relate to different application situations, including structural color, absorbers, filters, distributed Bragg reflectors (DBRs), Fabry–Pérot (FP) resonators and other arbitrary spectrum targets. All of them are converted to reflection and transmission spectra. (d) One example of the “structure serialization” for an N-layer structure on a glass substrate. This N-layer structure is serialized into N+1 tokens.
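The “structure serialization” in panel (d) can be illustrated with a minimal sketch. It assumes that each layer becomes one `material_thickness` token and that the extra (N+1)th token is the ‘EoS’ marker; the token format, the helper names `serialize` and `deserialize`, and the example stack are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

EOS = "EoS"  # end-of-sequence marker; assumed here to be the (N+1)th token


def serialize(layers: List[Tuple[str, int]]) -> List[str]:
    """Turn [(material, thickness_nm), ...] into N layer tokens plus EoS."""
    tokens = [f"{material}_{thickness}" for material, thickness in layers]
    return tokens + [EOS]


def deserialize(tokens: List[str]) -> List[Tuple[str, int]]:
    """Recover the layer list, stopping at the first EoS token."""
    layers = []
    for token in tokens:
        if token == EOS:
            break
        material, thickness = token.rsplit("_", 1)
        layers.append((material, int(thickness)))
    return layers


# Hypothetical 3-layer stack, listed in stacking order.
stack = [("TiO2", 60), ("SiO2", 100), ("Ag", 20)]
tokens = serialize(stack)
```

With this encoding, a 3-layer stack yields 4 tokens, and `deserialize(tokens)` returns the original layer list.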
Details of the OptoGPT model. (a) The model architecture of our OptoGPT, which is a decoder-only transformer. More details can be found in SI 1.3. (b) The working diagram of our OptoGPT model. (c) The diagram of the auto-regressive design process. When designing the ith layer, we sample from the probability output to obtain the layer information. This design process continues until the maximum of 20 layers is reached or ‘EoS’ is sampled.
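The auto-regressive loop in panel (c) can be sketched as follows. This is only an illustration of the stopping rule (20 layers or ‘EoS’): the function `design`, the callback `next_token_probs`, the toy vocabulary and the toy model are all hypothetical stand-ins, and the conditioning on the target spectrum is omitted.

```python
import random
from typing import Callable, List, Sequence

MAX_LAYERS = 20  # maximum number of layers stated in the caption
EOS = "EoS"


def design(
    next_token_probs: Callable[[List[str]], Sequence[float]],
    vocab: Sequence[str],
    rng: random.Random,
) -> List[str]:
    """Sample one layer token at a time until EoS or MAX_LAYERS is reached."""
    structure: List[str] = []
    while len(structure) < MAX_LAYERS:
        probs = next_token_probs(structure)  # distribution for the next layer
        token = rng.choices(vocab, weights=probs, k=1)[0]
        if token == EOS:
            break
        structure.append(token)
    return structure


# Toy stand-in for the trained model: always proposes SiO2_100, then EoS
# once three layers have been placed.
vocab = ["SiO2_100", "TiO2_60", "Ag_20", EOS]


def toy_model(prefix: List[str]) -> Sequence[float]:
    return [0.0, 0.0, 0.0, 1.0] if len(prefix) >= 3 else [1.0, 0.0, 0.0, 0.0]


designed = design(toy_model, vocab, random.Random(0))

# A model that never emits EoS is cut off at MAX_LAYERS.
capped = design(lambda prefix: [1.0, 1.0, 1.0, 0.0], vocab, random.Random(0))
```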
2D visualization of the hidden space using t-SNE for dimensionality reduction. (a) The working diagram of using t-SNE to reduce the structure embeddings and spectrum embeddings to 2D. (b) t-SNE visualization of 900 structure tokens and 1,000 spectra randomly selected from the validation dataset. Spectra are marked as green crosses and structure tokens as colored dots, where different colors correspond to different materials. The green dashed circle illustrates the approximate boundary between spectra and structures. Inside this boundary are the spectra, with two example spectra (marked as red crosses) given in (iii) and (iv). Outside the green boundary are structure tokens corresponding to different material and thickness combinations. Structure tokens with the same material either form a line or cluster together. Along each line, the dot size decreases monotonically from one end to the other, corresponding to a monotonic thickness decrease from 500 nm to 10 nm. Most lines converge into two regions, with zoomed-in details given in (i) and (ii) corresponding to the low-refractive-index and high-refractive-index regions, respectively. Our model demonstrates the ability to learn the material and thickness from a large dataset without their explicit inputs.
Results of inverse design performance on the validation dataset. (a) The Mean Absolute Error (MAE) on 1,000 random spectrum targets from the validation dataset. The orange, blue and red dots correspond to the closest structures in the training dataset, the designed structures and the finetuned structures, respectively. Their average MAEs are 0.0296, 0.0258 and 0.0192, respectively. (b) The number of layers in the target structure vs. the number of layers in the designed structure. On average, the designed structures have 6 fewer layers than the target structures. (c) Time comparison of forward simulation using TMM and inverse design using OptoGPT (without finetuning). Results are averaged over 1,000 random targets in the validation dataset, showing that our model makes inverse design as fast as running a TMM simulation. (d) One inverse design example from the validation dataset. The table below gives the details of the five designed structures and the finetuned structure, as well as their spectrum MAEs.
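For reference, the MAE reported in panel (a) can be computed as in the minimal sketch below. It assumes that the target and designed spectra are sampled at the same wavelengths and that the error is averaged over all sampled values; the function name `spectrum_mae` and the numbers are illustrative, not data from the paper.

```python
import math
from typing import Sequence


def spectrum_mae(target: Sequence[float], designed: Sequence[float]) -> float:
    """Mean absolute error between two spectra on the same wavelength grid."""
    if len(target) != len(designed):
        raise ValueError("spectra must be sampled at the same wavelengths")
    return sum(abs(t - d) for t, d in zip(target, designed)) / len(target)


# Illustrative reflection values at three wavelengths (not data from the paper).
target = [0.1, 0.5, 0.9]
designed = [0.2, 0.5, 0.7]
mae = spectrum_mae(target, designed)  # (0.1 + 0.0 + 0.2) / 3
```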
Examples of inverse design for artificial spectra in different applications. (a) Design for a band-notch filter at 550 nm. (b) Design for high reflection in the NIR. (c) Design for a perfect absorber. (d) Design for an arbitrary absorber. Here, solid lines, dotted lines and squared lines correspond to the spectrum of the artificial target, the spectrum of the closest structure in the training dataset and the spectrum of the designed structure from our model (with thickness finetuning), respectively. (e, f) show examples of designing reflective and transmissive structural color, respectively. We use the color difference ΔE to evaluate the design performance (a smaller ΔE means a smaller color difference). For each color, the first, second and third bricks correspond to the target color, the closest color in the training dataset, and the designed color from our model (with thickness finetuning), respectively. More details and examples can be found in section 2 in SI.
Illustration of design flexibility. (a) A visualization of the design process when adding a design constraint. We use the example of “remove Ag from material selection at ith layer”. When designing the desired ith layer, we remove the tokens that do not satisfy the constraints from the probability distribution and sample only from the probability renormalized over the remaining tokens. (b–e) Comparison of the spectrum performance under different constraints. The solid lines and squared lines are the target spectrum and the spectrum of the designed structure under each constraint, respectively. More examples of design flexibility can be found in section 3 in SI.
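The constraint step in panel (a) amounts to masking and renormalizing a distribution, which can be sketched as below. The function `apply_constraint`, the `material_thickness` token format and the probabilities are hypothetical and serve only to illustrate the “remove Ag” example.

```python
import math
from typing import Callable, Dict


def apply_constraint(
    probs: Dict[str, float], allowed: Callable[[str], bool]
) -> Dict[str, float]:
    """Drop tokens that violate the constraint, then renormalize the rest."""
    kept = {token: p for token, p in probs.items() if allowed(token)}
    total = sum(kept.values())
    if total <= 0:
        raise ValueError("constraint removes every token with nonzero probability")
    return {token: p / total for token, p in kept.items()}


# Hypothetical next-layer distribution over three tokens.
probs = {"Ag_20": 0.5, "SiO2_100": 0.3, "TiO2_60": 0.2}

# "Remove Ag from material selection at the ith layer".
no_silver = apply_constraint(probs, lambda token: not token.startswith("Ag_"))
```

After masking, the remaining probabilities keep their ratio (0.3 to 0.2) and sum to 1, so sampling proceeds exactly as in the unconstrained case.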
Design performance for different angles and polarizations. (a) The diagram of the finetuning process. (b–g) give inverse design examples for spectra with 20° s-polarization, 60° s-polarization, 10° p-polarization, 50° p-polarization, 30° unpolarized and 50° unpolarized light, respectively. The solid line, dashed line and squared line correspond to the target spectrum, the spectrum designed by the pretrained model and the spectrum designed by the finetuned model, respectively.
Simultaneous design for different angles and polarizations. (a) The diagram of mixed sampling for simultaneous design. (b) One example of designing an angle-robust spectrum for 0°, 20° and 40° unpolarized light.