Inspired by an article "Virtual Planet Sleuths" by Gregory P. Laughlin published in October 2006 issue of Sky & Telescope magazine, as astronomy amateurs, we wanted to get a better understanding of processes behind the process of exoplanet search via radial velocity analysis. Practically, we have decided to write an application from scratch which would be automatically analysing radial velocity data sets and provide the user with orbit characteristics of underlying exoplanets orbiting a particular star. ASE, for automatizing the search, applies optimization and minimization algorithms such as GA, MGA (SQP) and similiar. Playing around with publicily available RV data sets produces amazing results. ASE evolves from a simple Excel-based application into a stand-alone application as an amateur approach to astrophysics, astrostatistics, astrodynamics and celestial mechanics.
Currently ASE is an application which allows to:
- Search exoplanet orbital parameters (models) using Genetic Algorithms
- Polish models with Levenberg-Marquardt and Multiobjective Goal Attainment (MGA) -
sequential quadratic programming (SQP) methods
- Quickly search periods with Lomb-Scargle periodogram and averaging filter with automatic Pmin/Pmax search constraint detection by FWHM of the peak
- Execute batch runs to search for the best fit (model)
-
Execute multicomponent GA scans: analyzing residuals and accumulative scans (described below in standard procedures in detail)
- Do quick bootstrap analysis to detect model uncertainities, standard distribution, and build histograms of planet's parameter distributions
- Linear detrending of RV data set
- Orbital stability checks with the following N-body algorithms: second-order mixed-variable symplectic (MVS), general Bulirsch-Stoer, Conservative Bulirsch-Stoer, Everhart's RA15 (RADAU) and Hybrid symplectic/Bulirsch-Stoer integrator - via John E. Chambers MERCURY6, controlled from ASE.
Current ASE build uploaded: 20th November 2006, 21:03 - this is alpha version (may be chaotic, LM polishing not fully working correctly with constraints - PROC 5 MGA [SQP] polishing is recommended as a default procedure.)
1) Download ase.zip (currently around 5 MB) which includes ase.exe - the ASE binary itself, unzip the archive in any folder, say C:\ASE.
2) In order to run ASE binary you will need MATLAB Component Runtime (MCR) - 78 MB file. This is a freely distributable piece of software. You can download MCR here: http://www.bcsatellite.net/ase/MCRInstaller.msi. Download this file and install it. After MCR installation is done, reboot your computer.
3) After reboot, you should be able to run ase.exe and open Systemic .vels files (RV datasets) and work with ASE.
Please notice it takes some time for ASE to unarchive it's data for the first run. This is done only once and the subsequent start times are much shorter.
- Implement FWHM period autodetection constraint with minimum correspond ot Nyquist frequency and maximum to .. ?
- Implement Wisdom-Holman orbital stability check penalties in GA searches (GAMP)
PROC 1: Genetic Algorithm Search. Searches N components (specified in GUI) within the current RV data set. Constraints are specified by GUI. No automatic constraints setup. GA search parameters are fixed at 1000 generations and 0.9 mutation rate. If 4:2:1 resonance checkbox is clicked, ChiSq/L penalizes if the resonance is not met, thus narrowing down the search to components which comply with this rule. Resonance is checked at each iteration. Results in the found model being output in GUI.
PROC 2: Multicomponent GA/Polishing Residuals Scan. Searches only one component at a time, by applying PROC 1 (with N=1) with automatically set sontraints - Pmin/Pmax via FWHM of the strongest periodogram peak, maxK/max Stellar Offset from the RV data set, maxOmega/maxe from the GUI. Once the component is found, the fit is substracted and the new periodogram is built for the residuals and constraints are updated according to the next strongest peak. Procedure continues to search within residuals. Procedure is repeated until three components are found or until there are no significant peaks on the periodogram. At the end, all the components are summed to form an N-component model and PROC 5 is applied to the complete model. Results in the found model being output in GUI and residuals of this model displayed in periodogram.
PROC 3: Batch Run. Applies PROC 1+PROC 5 (standard GA search and polish) or PROC 7 (accumulative GA scan), B amount of times, where B is specified in the GUI (defaults to 100). User chooses to use PROC 1+PROC 5 or PROC 7 in the GUI as well. The output is B amount of models per the current data set, saved in C:\BATCH.ASE file for further analysis. The best model is displayed in GUI.
PROC 4: NLS Polish. Applies polishing technique T, iters amount of times, T (Levenberg-Marquardt or Gauss-Newton) and iters are specified in the GUI. Results in the polished model being output in GUI if it is better than original.
PROC 5: MGA (SQP) Polish. Applies MGA polishing technique only once. Results in the polished model being output in GUI if it is better than original.
PROC 6: Bootstrap analysis. Substracts current fit from the RV data, residuals are scrambled to form B amount of synthetic models by adding scrambled residuals back to the original fit (error bars are saved). For each B synthetic model, takes the original fit and polishes it via PROC 5 and saves the model. Once all synthetic models are analyzed, it derives uncertainities per each of the orbit parameters of the original fit, saves model with uncertainities description in C:\BOOTSTRAP.ASE file for further analysis. After bootstrap is finished, a new windows appear with histograms of distribution of parameter values for each component. Please notice this is quick and dirty bootstrap analysis as this procedure does not try to find the *best* fit per each synthetic model but rather uses the fit which PROC 5 was able to produce.
PROC 7: Multicomponent GA/Polishing Accumulative Scan. Similiar to PROC 2 (uses periodogram to automatically set search period constraints), except that it does not search next components within residuals but in the whole data set at the same time. In this procedure, when the first component is found, a residuals periodogram is built and next component is added into the model with search period defined by FWHM of the peak in periodogram. After that PROC 1 is started from scratch, already by searching two components at the same time. That repeats for each next component. After the last component is found, the whole model is polished by PROC 5. PROC 7 is more efficient and more realistic than PROC 2 because it does not "fixes" components as it finds them (it does not substracts the fit) and allows component's parameters to fluctuate and thus to exchange RV data points between planets - to achieve a better fit at the final model. Results in the found model being output in GUI and residuals of this model displayed in periodogram.
8th October 2006: Sent an introductionary e-mail to Mr. Laughlin with a link to this page.
9th October 2006: Mr. Laughlin came back with a positive reply and the green light for the project. Initial discussion starts.
11th October 2006: Obtained reference to
The Astrophysical Journal, Volume 545, Issue 2, pp. 1044-1057 publication,
The
υ Andromedae System: Models and Stability by Stepinski, Tomasz F. ; Malhotra, Renu ; Black, David C. Publication features concepts used in Systemic
of construction of a stellar reflex
velocity (RV) curve from a planetary configuration in the Keplerian
approximation. Mr. Laughlin has also pointed towards the file locations of RV libraries in Systemic.
17th October 2006: Finished a working sheet in Excel to model RV(t) curves basing on Keplerian orbit characteristics. All values derived correspond to Systemic. Modelled 14 Her visually in Excel and prooved it with Systemic console successfully.
18th October 2006: Two-component model crafted in Excel and commercial PiBlue OptWorks Genetic Algorithm module was purchased and connected in order to examine the behaviour of multicomponent search process with GA.
20th October 2006: Introduced Chi^2red fitness parameter to fully correspond to Systemic's ChiSq. Three-component model finished. Applying such optimization methods as Simulated Annealing and passing derived values to be polished further to Coordinate Pattern Search produced 14
Her three-component model with Chi^2red=2.22. So far it is more optimal to use GA or SA to produce intial values (Chi^2red=~20-100) and pass them to polishing in CPS to reach better fitness (Chi^2red=~2-10), which produces approx up to ten time more precise results in average on 14 Her models. Multiple runs of GA/SA->CPS sequence is required in order to produce better models. A compass CPS search advances in areas where standard CPS brings no improvement. GA setup with 100 populations, 10000 convergence generations, crossover probability 0.7 and mutation probability 0.9 is the most productive.
22th October 2006: Introduced Required Jitter calcualtion which has to be <=5 m/s for most of the main sequence stars.
30th October 2006: Started porting ASE to Matlab7 with features of Genetic Algorithm search, Lomb-Scargle periodogram, Levenberg-Marquardt, Gauss-Newton and Multiobjective Goal Attainment polishing as well as batch run options to generate multiple models automatically.
30th October 2006: Automatically obtained results with 100 models of HD33283, with Pmax set to 150 days, stellar offset from -5 to 5 m/s, maximal K=40 m/s, max e=0.5 m/s. Took approx 3 hours to produce on Athlon 1.4 GHz CPU.
31th October 2006: Added 50 3-companion models of HD33283 on the same page as well as single companion models.
3rd November 2006: Processed Tau Bootis sets from Tom Kaye (Spectrashift Team): the most accurate detrended set models and older set (linear tread removed in ASE). Results show good correlation with published Tau Booties component. An exoplanet can definately be detected with a 400mm telescope!
18th November 2006: Analysis of 14 Her b and 14 Her c.
1) Data set format. RV data sets are usually recorded as a triplets ( t , V , sigma), where t is the time of observation (in Julian days), V is an unaccounted-for component of the star's radial velocity (hereafter referred to as radial velocity) in m/s^-1 , and sigma is a measurement error in m/s^-1. As per Mr. Laughlin's description, "All of the radial velocity datasets are contained in separate files in ascii format. Format is JD, RV, sigma." where JD corresponds to t and RV to V.
2) Simple companion model (heart algorithm of ASE).
(source: The Astrophysical Journal, 545:1044-1057, 2000 December 20) Assume a model consisting of a single companion, labeled B, orbiting the star. Such a one-companion model predicts the radial velocity, V mod, B ( t ), at any given instant of time. The radial velocity signal due to the orbital motion of the star caused by gravitational interaction with a companion is given by the following expression
![]()
where e is the eccentricity of the orbit, f is true anomaly, and omega is its argument of the periastron. The semiamplitude K is proportional to the projected mass of the companion, m sin i , where i is the angle between an observer's line of sight to a star and the normal to the orbital plane of the companion.
The true anomaly, f , can be expressed in terms of the eccentric anomaly, u ,

In turn, eccentric anomaly, u , can be linked to time by means of Kepler's equation,

where P is the period of the companion's orbit and T peri is the time of periastron passage.
3) Multiple companions model. (source: The Astrophysical Journal, 545:1044-1057, 2000 December 20) If we assume two companions, labeled B and D, to orbit the star, then, in the first approximation, the two-companion model is simply given by V mod ( t ) = V mod, B ( t ) + V mod, D ( t ), with individual contributions given in the simple companion model above. Thus, presupposing that the radial velocity signal from star is mostly due to gravitational interactions with multiple companions, the N -companion model can be written as follows:

where R( t ) encapsulates sources of radial velocity signal that cannot be attributed to the presence of companions but instead are intrinsic to the star. They may, in principle, include pulsation and effects due to the inhomogeneous and dynamic nature of the stellar convective and magnetic patterns.
4) Linking radial velocity to time and building radial velocity curve: Tutorial.
In order to link V to t, assuming that Tperi, P, e, omega and K are derived from genetic algorithm and M* (mass of the star hosting the planetary system) is known in advance per it's spectral class and luminosity relationship, the following steps should be executed:
5) Fitness Function
As per The Astrophysical Journal, 545:1044-1057, 2000 December 20, we use the following function with R(t)=0

where Vk is the measured radial velocity, Vmod is the generated curve, sigma is the instrumental error. See steps 11 and 12 in the tutorial above.
6) Genetic Algorithms in Excel: first trial on 14 Her (or re-discovering 14 Her b).
With 14 Her, by using freeware GA module for Excel written by Alexander C. Schreyer, v. 1.2 (Build 1018) (August 16, 2005), and the following settings in the GA module: Tperi >2449465<2452856 day (t0 is the first day of observations, tlast is the last day of observations), Period >1000<2000 days, eccentricity >0.1<0.95, Longitude of periastron >0<360 degrees, Semiamplitude >10<150 m/s. It basically took less than 10 minutes to produce 100 generations (a single Pentium-M 1.4 GHz CPU) with the best fitting orbit of the single "b" companion of 14 Her with the following characteristics:
| 14 Her b: | Value derived by GA, 100 generations, 4 prelim. runs, 10G/run (~10 min) | Value derived by GA, 300 generations, 4 prelim. runs, 10G/run (~23 min) | Value derived by GA, 300 generations, 16 prelim. runs, 10G/run (~34 min) | Value derived by GA (Optworks), 100 generations (~5 min) | Reference* |
| Time of Periastron Passage, Tperi, day | 2449465 | 2449465 | 2451456 | 2441289.76 | 2451368.1 |
| Period, P, days | 1990 | 1870 | 1810 | 1781.91 | 1755.3 |
| Eccentricity, e, (0 to 1) | 0.33263803 | 0.328 | 0.342008 | 0.36293 | 0.3734 |
| Longitude of Periastron, omega, degrees | 35.58670711 | 15.101 | 41.5945623 | 33.6891 | 19.4 |
| Semiamplitude, K, m/s | 86.71200119 | 89.238 | 88.9999609 | 89.2375 | 91.8 |
| Mass of companion, m sin i, Mjup | 5.0634 | 5.1129 | 5.0174 | 4.9626 | 5.07 |
| Mean Anomaly, M(t0), degrees | 359.9268 | 359.9221 | 323.9196 | 327.0778 | - |
| Fitness Function, Chi | 30.3043 (not reduced!) | 23.0860 (not reduced!) | 22.4041 (not reduced!) | 21.5961 (not reduced!) | 2.1 (reduced by L) |
| Stellar Offset, m/s | 0 (was not implemented during the search) | 0 (was not implemented during the search) | 0 (was not implemented during the search) | 0.02932 | Unknown |
* " Catalog of Nearby Exoplanets", 2006, Astrophysical Journal, Vol. 646, Pg. 505 (PDF) (Abstract) R. P. Butler 2 , J. T. Wright 3 , G. W. Marcy 3,4 , D. A Fischer 3,4 , S. S. Vogt 5 , C. G. Tinney 6 , H. R. A. Jones 7 , B. D. Carter 8 , J. A. Johnson 3 , C. McCarthy 2,4 , A. J. Penny 9,10 is taken as a reference.
7) Initial Search Constraints.
Eccentricity. As per "Catalog of Nearby Exoplanets", 2006, Astrophysical Journal, Vol. 646, Pg. 505, the eccentricity is usually distributed between 0 and 0.8, thus by default we limit 0.8>e=<0 to exclude non-realistic models when the curve tries to "reach" RV peaks with high eccentricity planets. The more circular orbits are the better for overall system stability that is.
Period and Time of periastron passage. Resonance: Implemented paramer Pdiff and added constraint into the optimization routine so that P2/P1>=Pdiff and P3/P2 >=Pdiff. This should be always checked during the optimization run. With Pdiff set to 1.5 - period of the second component should be at least 1.5 times longer than the period of the first component, and the period of the third component should be at least 1.5 times longer than the period of the second.
When initializing GA, the definition of range for Tperi should be t0=<Tperi=<(t0+Pmax).
Pmax=tlast-t0 (confirmed by "The minimum frequency (or maximum period) is given by the span of the data itself. In principle, the maximum period you could test for is 5000 days [assuming data set spans for 5000 days] - you could see the data pass through one peak and one trough. However, this assumes that you know in advance that the data is periodic. A more rigorous constraint is that your data should contain at least two cycles. So in this case, the maximum period you could test for is half of the time span, or 2500 days." source: Time-Series Analysis of Astronomical Data by Dr. Matthew Templeton, AAVSO), which means that at least one complete period should be captured in the RV data set. (Systemic users search planets with P>Pmax!)
The goal is to find Pmin and Pmax for the GA search as it would greatly enhance the effectiveness of initial GA search. The following steps are undertaken so far to define Pmin/Pmax doublet:
That will produce Pmin/Pmax search constraint of the FWHM of the most significant peak on periodogram. For example, on HD33283, the highest peak is 18.1111 days. By applying the Pmin/Pmax search procedure above with tcoeff=0.5 (FWHM), this will yield 17.7838-18.4506 days search range into which the best fits of 18.179 days component "b" (as per ASE modelling and exoplanet.eu catalog) fall well. Similiar settings yield 1233.391-3391.826 days window for 14 Her b component which is 1755 days. It makes sense to "secure" the frequencies by substracting x from Pmin and adding x to Pmax (say, x=1 day) to make sure the averaging filter did not brought any noticeable shift to the periods.
Semiamplitude. Suggested min of 0 m/s and maximum to double of the largest star's RV amplitude (2*max(RVds)).
Stellar Offset . Suggested min of -2*max(RVds) and max of 2*max(RVds).
8) Bootstrap Method.
(source: R. Paul Butler,..) The uncertainties ... are based on Monte Carlo realizations of the data. The best-fit radial velocity curve is subtracted from the original velocities, and the residuals are adopted as representative of the amplitude and distribution of velocity noise from all sources, including velocity errors and photospheric jitter. The residuals are permuted and added back to the best-fit radial velocity curve, leaving the times of observation the same. This approach yields many realizations of the set of velocity measurements of the best-fit planet, assuming that the noise distribution is as exhibited by the residuals. A Gaussian distribution of errors was not assumed (but such a simplification yields similar values for the uncertainties in the orbital parameters). Each realization was fitted with a Keplerian model, allowing calculation of the standard deviation of each orbital parameter. These are adopted as the 1 sigma uncertainties... Note that these quoted uncertainties do not incorporate the uncertainty in the mass of the star itself.
Bootstrap tutorial:
1) Once the model M is derived after searches/batches/scans, we shall take the residuals (R) - RVm-RVf.
2) For each synthetic dataset, the R is perturbated (scrambled), resulting in Rperturb. Resulting synthetic dataset RVs=Rperturb+RVf. Important: sigma matrix (error bars) have to be scrambled exactly the same way the residuals are, i.e. each residual value should have it's own error bar attached to it during the scrambling.
3) For each synthetic dataset we shall do *the same* search/batch/scan procedure used to derive model M, in order to derive a set of synthetic models SM. Each synthetic model SM represents the best fit for the corresponding synthetic dataset RVs.
4) Basing on synthetic models SM we can derive uncertainities for each parameter:
Uncertainities are measured as standard deviation, sigma:

where n is number of models, xi is the i-th model and xdash is the mean of all the models, sigma^2 is the dispersion. Mean is calculated as following:

The mean values, xdash of all the synhethic models should not deviate much from the original model values which is being bootstrapped.
Tutorial to calculate uncertainities:
1) Take each parameter, P, Tperi, omega, K, and e and calculate it's mean value among all the synthetic models - xdash.
2) Calculate sigma of each parameter using equations above.
3) Record result as xdash +/- sigma.
9) Stability checks
- For stability checks, Krzysztof Go'zdziewski in ORBITAL CONFIGURATIONS AND DYNAMICAL STABILITY OF MULTI-PLANET SYSTEMS AROUND SUN-LIKE STARS HD 202206, 14 HER, HD 37124 AND HD 108874 uses GAMP: We derive dynamically stable configurations which reproduce the observed RV signals using our method called GAMP (an acronym of the Genetic Algorithm with MEGNO Penalty). The GAMP relies on the N-body dynamics and makes use of genetic algorithms merged with a stability criterion. For this purpose, we use themaximal Lyapunov exponent computedwith the dynamical fast indicator MEGNO. Through a dynamical analysis of the phase-space in a neighborhood of the obtained best-fit solutions, we derive meaningful limits on the parameters of the planets. Without taking into account the stability criterion and due to narrow observational windows, the orbital elements of the outermost planets are barely constrained and both Keplerian, as well as Newtonian, best-fit solutions often correspond to self-disrupting configurations. ... The unstable solutions are penalized by artificially increasing their (Chi2n)^1/2. For determining the character of motions, we rely on the computation of the maximal Lyapunov exponent through the MEGNO indicator (Cincotta & Sim´o 2000; Cincotta et al. 2003; Giordano & Cincotta 2004).
- from Aron S¨uli1⋆, Rudolf Dvorak2 and Florian Freistetter3 in "The stability of the terrestrial planets with a more massive ”Earth”": The conservative definition of the point at which systems become unstable is when close encounter between two planets happens: two bodies approach one another within an area of the larger Hill radius. The consequence of such an event is the dramatic changes in the orbital elements of the two planets, and usually the escape of the planet with smaller mass. In this paper we define a system unstable, when an orbit crossing or a close encounter happens. This definition is omewhat more general and the instability can be directly connected to the eccentricity via the perihelia and aphelia distances. Henceforward we state that our model is dynamically unstable if orbit crossing or close encounters happen in the course of the integration.
- Proposed implementing MERCURY6 code by John E. Chambers. MERCURY is a general-purpose software package for doing N-body integrations. It is designed to calculate the orbital evolution of objects moving in the gravitational field of a large central body. MERCURY is written in Fortran 77. The code is slightly non standard. We have successfully complied the code using Salford Software FTN77 compiled under Win32. MERCURY currently includes the following N-body algorithms:
o A second-order mixed-variable symplectic (MVS) algorithm incorporating simple symplectic correctors (see J.Wisdom et al. 1996, Fields Instit. Commun. vol 10 pp217) - this is very fast but it cannot compute close encounters between objects.
o A general Bulirsch-Stoer - slow but accurate in most situations.
You can use this when all else fails, or to test whether the other
algorithms are appropriate for the problem you want to study.
o Conservative Bulirsch-Stoer - twice as fast as the general BS
routine, but it will only work for conservative systems, in which
accelerations are a function of position only (e.g. Newtonian
gravity, but not General Relativity).
o Everhart's RA15 (RADAU) - about 2-3 times faster than the general version of Bulirsch-Stoer. Usually reliable, except for very close encounters or very eccentric (e.g. Sun grazing) orbits.
o Hybrid symplectic/Bulirsch-Stoer integrator - very fast but only
moderately accurate. This algorithm can compute close encounters. This one will be used by default by ASE for stability checks. Proposed test timeframe of 10^4 amount of the longest period, in case of 14 Her c that will be approx 250'000 years with a timestep of 356 days.
1) Genetic Algorithm for Matlab - by Andrey Popov, Developed as part of Thesis work: “Genetic Algorithms for Optimization – Application in Controller Design Problems”,TU-Sofia, 2003.
2) Lomb-Scargle Periodogram Realisation for Matlab - written by Brett Shoelson, Ph.D., 3/1/1999. Last modification: 10/25/01, further modified to fit the specifics of exoplanet search tasks in ASE.
3) Systemic - online collaborative project for RV data sets and SpectraShift - amateur exoplanet search project for Tau Boo RV datasets.
4) MERCURY6 - J.E.Chambers (1999) ``A Hybrid Symplectic Integrator that Permits Close Encounters between Massive Bodies''. Monthly Notices of the Royal Astronomical Society, vol 304, pp793-799.
Maxim Usatov and Polina Usatova,
E-mail: maxim.usatov@bcsatellite.net, polina.usatova@bcsatellite.net
Dniepropetrovsk, Ukraine
2006