Richard E.L. Higgins^{1}, David F. Fouhey^{1}, Dichang Zhang^{1}, Spiro K. Antiochos^{2}, Graham Barnes^{3}, Todd Hoeksema^{4}, KD Leka^{3}, Yang Liu^{4}, Peter W. Schuck^{2}, Tamas I. Gombosi^{5}

1 Computer Science and Engineering, University of Michigan, Ann Arbor, MI 48109

2 NASA Goddard Space Flight Center, Greenbelt, MD 20771

3 NorthWest Research Associates, Boulder, CO 80301

4 HEPL, Stanford University, Stanford, CA 94305

5 Climate and Space Science Engineering, University of Michigan, Ann Arbor, MI 48109

**Abstract** The magnetogram products of HMI^{[1]} and its analysis pipeline are the result of a per-pixel optimization that estimates solar atmospheric parameters by minimizing the disagreement between synthesized and observed Stokes vectors. We introduce a deep learning-based approach that can emulate the existing HMI pipeline results two orders of magnitude faster than the current pipeline algorithms. Our system is a U-Net^{[2]} trained on input Stokes vectors and their accompanying optimization-based VFISV^{[3]} inversions. We demonstrate that our system, once trained, can produce high-fidelity estimates of the magnetic field and of kinematic and thermodynamic parameters while also producing meaningful confidence intervals. We additionally show that, despite penalizing only per-pixel loss terms, our system faithfully reproduces known systematic oscillations in full-disk statistics produced by the pipeline. This emulation system could serve as an initialization for the full Stokes inversion or as an ultra-fast proxy inversion.

Figure 1| Our approach for emulating the SDO/HMI Stokes vector inversion pipeline, VFISV. As input, our network takes Stokes vector measurements (IQUV), metadata, and estimates of the continuum intensity. As output it produces a per-pixel estimate of a single parameter of the inversion as would be produced by VFISV, e.g., inclination. We cast the problem as regression-by-classification over a discrete set of bins. We show that this is both accurate and enables fast uncertainty quantification.

**Introduction** Our emulation of the VFISV Stokes Inversion trains a deep network (U-Net) to map directly from IQUV polarized light (the hmi.S_720s series) to Milne-Eddington magnetic field parameters. In particular, the system aims to predict eight targets: Field (Mx/cm^{2}), Inclination (degrees), Azimuth (degrees), LOS Velocity (cm/s), Doppler Width (mÅ), η0 (dimensionless), S0 (Data Number/s), S1 (Data Number/s).

The method is structured as regression-via-classification^{[4]}: rather than directly regressing a value, the system predicts a K=80 dimensional vector of logits for every (x,y) point in the image. The logits correspond to bins that linearly partition the range of potential target values, e.g., the 0°-180° range of inclination is split into 80 bins of 2.25° each, with the first bin representing 0°-2.25°. The VFISV target values are represented similarly as K-dimensional vectors, and the entire system is trained by minimizing a KL-divergence loss between the two distributions. This regression-via-classification approach lends itself naturally to quantifying uncertainty in the network, as clustered probability mass represents confidence while spread probability mass represents uncertainty. Our method is fast, requiring only 4.8s per target on a consumer GPU.
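To make the binning concrete, the following is a minimal NumPy sketch of regression-via-classification for a single pixel. The text fixes only K=80, the linear binning, and the KL-divergence loss; everything else here is an assumption made for illustration: the soft target built by linear interpolation between the two nearest bin centers, the expected-value decoding, the CDF-based interval, and all function names. It is not the authors' implementation.

```python
import numpy as np

K = 80  # number of bins, as stated in the text


def bin_centers(lo, hi, k=K):
    """Centers of k equal-width bins spanning [lo, hi]."""
    width = (hi - lo) / k
    return lo + width * (np.arange(k) + 0.5)


def soft_target(value, lo, hi, k=K):
    """Encode a scalar as a distribution over bins.

    Assumption: mass is split linearly between the two nearest bin
    centers; the source does not say how targets are encoded.
    """
    width = (hi - lo) / k
    first_center = lo + 0.5 * width
    pos = float(np.clip((value - first_center) / width, 0.0, k - 1.0))
    left = int(np.floor(pos))
    right = min(left + 1, k - 1)
    frac = pos - left
    target = np.zeros(k)
    target[left] += 1.0 - frac
    target[right] += frac
    return target


def softmax(logits):
    """Turn a vector of logits into a probability distribution."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()


def kl_divergence(target, pred, eps=1e-12):
    """KL(target || pred), summed over bins where target > 0."""
    mask = target > 0
    return float(np.sum(target[mask] * (np.log(target[mask]) - np.log(pred[mask] + eps))))


def decode(pred, lo, hi, k=K):
    """Point estimate: expected value under the predicted distribution."""
    return float(np.dot(pred, bin_centers(lo, hi, k)))


def interval_width(pred, lo, hi, level=0.90, k=K):
    """Width of a central interval read off the binned CDF."""
    centers = bin_centers(lo, hi, k)
    cdf = np.cumsum(pred)
    tail = (1.0 - level) / 2.0
    lo_idx = min(int(np.searchsorted(cdf, tail)), k - 1)
    hi_idx = min(int(np.searchsorted(cdf, 1.0 - tail)), k - 1)
    return float(centers[hi_idx] - centers[lo_idx])
```

For inclination one would call, for example, `soft_target(45.0, 0.0, 180.0)` to build the training target and `decode(softmax(logits), 0.0, 180.0)` to read out a prediction. A distribution concentrated in one bin yields a zero-width interval, while a flat distribution yields a wide one, which is the sense in which spread probability mass signals uncertainty. Note that the expected-value decoding ignores the wrap-around of angular quantities such as azimuth.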

Figure 2| Qualitative results for the full disk and a few active regions on held-back data corresponding to observations at 2016/05/10-06:48:00 TAI. The predicted full-disk images for magnetic field strength, inclination, and azimuth are generated by the proposed approach. We show cutouts for a few different areas: an active region towards the east limb, an active region in the center of the disk, and plage to the west of disk center. Field strength and inclination estimates are generally precise in regions of both moderate and weak polarization. Azimuth, on the other hand, is poorly constrained in areas of weak linear polarization. However, the azimuth-angle estimates from VFISV are also poorly constrained in such areas, so it is consistent that the proposed emulation is similarly noisy there.

**Results** We train instances of the system, i.e., fit parameters of the neural network, on solar disks sampled from the first 60% of 2015. We then validate the model’s performance on data never encountered during training, sampled from the remaining 40% of 2015. Finally, all evaluation, metrics, figures, and results are calculated from test data consisting of solar disks sampled from the entirety of 2016.
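The chronological split described above can be sketched as a function of observation time. This is an illustration only: it assumes that "the first 60% of 2015" means the first 60% of the calendar year by elapsed time (it could equally mean the first 60% of the sampled disks), and the function name `assign_split` is hypothetical.

```python
from datetime import datetime


def assign_split(t):
    """Return 'train', 'val', 'test', or None for an observation time t."""
    start_2015 = datetime(2015, 1, 1)
    start_2016 = datetime(2016, 1, 1)
    start_2017 = datetime(2017, 1, 1)
    # Assumption: the 60% boundary is measured in elapsed calendar time.
    cutoff = start_2015 + 0.6 * (start_2016 - start_2015)
    if start_2015 <= t < cutoff:
        return "train"
    if cutoff <= t < start_2016:
        return "val"
    if start_2016 <= t < start_2017:
        return "test"
    return None  # outside the years used in this work
```

Splitting by time rather than by random pixel or random disk keeps temporally adjacent, highly correlated observations from landing on both sides of a split.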

Across the solar disk, our emulated results for the field have a mean absolute error (MAE) of 9.67 Mx/cm^{2}. Inclination is similarly well estimated, with an MAE of 0.58°. Azimuth has an MAE of 13.06°, but this figure mixes weak-field regimes, where the azimuth is poorly defined, with regions where it is better defined. If we re-evaluate only in pixels with a moderate field strength of > 100 Mx/cm^{2}, the azimuth MAE drops to 8.58°. Other parameters are similarly well estimated.
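The restricted evaluation can be sketched as an MAE computed over a boolean pixel mask. This sketch assumes a plain absolute difference between emulated and VFISV values, with no special handling of angular wrap-around, and it assumes the mask is built from the VFISV field strength; the source specifies neither detail, and the name `masked_mae` is hypothetical.

```python
import numpy as np


def masked_mae(pred, target, mask=None):
    """Mean absolute error, optionally restricted to pixels where mask is True."""
    err = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    if mask is not None:
        err = err[np.asarray(mask, dtype=bool)]
    return float(err.mean())


# Toy example with four pixels (values are illustrative, not from the paper).
azimuth_pred = np.array([1.0, 2.0, 3.0, 4.0])
azimuth_vfisv = np.array([1.0, 4.0, 3.0, 0.0])
field_vfisv = np.array([50.0, 150.0, 200.0, 10.0])

mae_all = masked_mae(azimuth_pred, azimuth_vfisv)
mae_moderate = masked_mae(azimuth_pred, azimuth_vfisv, mask=field_vfisv > 100.0)
```

In the toy data the largest error sits in a weak-field pixel, so the masked MAE is lower than the full MAE, mirroring the pattern reported for azimuth.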

Figure 3| Average on-disk field strength and deviation from horizontal as a function of time over a two-week period with 24-hour periods separated by black vertical lines. The proposed system faithfully recreates known periodic behavior of the current SDO/HMI pipeline.

VFISV produces per-pixel uncertainty estimates and, beyond accuracy, we believe a good emulation should have uncertainties that correlate with those of VFISV. Except for field strength, we find reasonably good agreement, as measured by the Spearman’s ρ between the VFISV pipeline’s uncertainty and the width of our estimated 90% confidence intervals.
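As a reminder of what this comparison measures, Spearman's ρ is the Pearson correlation of the ranks of two variables, so it rewards any monotone relationship between the pipeline uncertainty and the interval width, not only a linear one. The sketch below is a minimal NumPy version that assumes no tied values (ties would need average ranks); the function names and the toy inputs are hypothetical.

```python
import numpy as np


def ranks(x):
    """Rank of each element (0 = smallest). Assumes no ties."""
    x = np.ravel(x)
    order = np.argsort(x, kind="stable")
    r = np.empty(x.size, dtype=float)
    r[order] = np.arange(x.size)
    return r


def spearman_rho(a, b):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(ranks(a), ranks(b))[0, 1])


# Toy per-pixel values: a monotone but nonlinear relationship.
vfisv_uncertainty = np.array([0.1, 0.5, 0.2, 0.9, 0.3])
interval_width_90 = vfisv_uncertainty ** 3

rho = spearman_rho(vfisv_uncertainty, interval_width_90)
```

Because only the ordering matters, the cubic relationship in the toy data is treated as perfect agreement, which suits a comparison between two uncertainty measures that need not share units or scale.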

We further conduct a series of ablations, showing that regression-via-classification, using relatively deep networks, and removing Batch-Norm^{[5]} are all important for performance. Finally, we show that the network’s emulations capture an aberrant periodic oscillation in VFISV’s output.

Overall, the accuracy of this method suggests that it could serve as a warm-start for VFISV or as a pre-disambiguation stand-in. Furthermore, the emulation can enable a more rapid analysis of the periodic artifacts present in VFISV, potentially leading to their correction and removal from the pipeline.

For more details and open-source code, please refer to our project website: https://relh.github.io/FAE-HMI-SI/

### References

[1] Schou, J., Scherrer, P. H., Bush, R. I., et al. 2012, *Solar Phys.*, **275**, 229

[2] Ronneberger, O., Fischer, P., & Brox, T. 2015, in International Conference on Medical Image Computing and Computer-Assisted Intervention

[3] Borrero, J., & Kobel, P. 2011, *A&A*, **527**, A29

[4] Ladický, L., Zeisl, B., & Pollefeys, M. 2014, in ECCV

[5] Ioffe, S., & Szegedy, C. 2015, arXiv:1502.03167