{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "651cff77",
   "metadata": {},
   "source": [
    "### What this script does \n",
    "Basically synthesis and analysis of data from all instruments (in-situ and remote sensing, also combining data from other 3 stations in the Netherlands)\n",
    "\n",
    "- loads CO₂ flux datasets from three stations (Veenkampen, Loobos, Amsterdam), converts µmol m⁻² s⁻¹ → ppm·m·s⁻¹, and plots time series.\n",
    "\n",
    "- loads a 10-minute merged sonic/mast/radiometer dataset from a date-based folder to continue (LCL work comes next).\n",
    "- plots multiple LCL estimates through the day:\n",
    "\n",
    "    - black markers: LCL from temperature at 2 m (LCL column’s first element per row).\n",
    "\n",
    "    - red markers: LCL computed from qv with T at 2 m (LCL_qv_T_2.0m).\n",
    "\n",
    "    - green markers: Romps LCL already in km (LCL_romps_km).\n",
    "\n",
    "    - orange line: CBL height from the parcel method (zi_parcel, converted to km).\n",
    "\n",
    "    - shades periods where LCL ≤ CBL (suggesting a possible cloud layer).\n",
    "\n",
    "- Sonic–mast comparisons: converts timestamps and makes two time-series scatter plots:\n",
    "\n",
    "    - Temperature_K_2.99 vs Average_Temperature_Corr\n",
    "\n",
    "    - qv_2.99m vs qv_sonic\n",
    "\n",
    "    - Merge MWR 10-min data: loads iwv_lwp_10min_avg.csv from the microwave radiometer folder and merges it into   merged_data_10min (nearest‐time join).\n",
    "\n",
    "- Plot LWP: creates a presentation-style plot of LWP_Corrected vs time and saves it.\n",
    "\n",
    "- Surface state from LW↑: derives T_srf, esat_srf, and qsat_srf.\n",
    "\n",
    "- Load Cloud Radar products: loads 10-min rain CSV and the 10-min vertical dataset (Parquet), suffixes CR columns, merges to the main dataframe.\n",
    "\n",
    "- Quality flags: adds Flag (|ΔT|>1 K) and Flag_Rain (rain>0) and plots a flagged temperature scatter.\n",
    "\n",
    "- Outputs: several PNGs saved into the appropriate day folders.\n",
    "\n",
    "\n",
    "- Loads vertical profiles: reads RH/AH 10-min profiles (profiles_data_10min_avg.parquet) and MWR vertical dataset (MWR_vertical_dataset_10min.parquet) for the chosen day.\n",
    "\n",
    "- Fuse with mast data: for each profile timestamp, pulls same-time mast RH and qv at fixed heights and builds a combined row.\n",
    "\n",
    "- Plot: makes side-by-side plots comparing radiometer vs mast RH (and later temperature) profiles at a chosen time index.\n",
    "\n",
    "- Temperature profiles: loads vertical_temperature_profiles_10min_avg.parquet, combines with mast temperatures and VDSE, and plots.\n",
    "\n",
    "- All-in merge: merges RH/AH combo, MWR vertical, and temperature combo into df_combined_all, then saves a figure.\n",
    "\n",
    "- Classifies cloud type per time step using radiometer RH profile, LWP, two flags (instrument/rain), and a clearness index CSI.\n",
    "\n",
    "- Saves the cloud classification to Excel and produces per-type profile plots (RH & temperature) into subfolders.\n",
    "\n",
    "- Merges your per-timestamp profile bundle (df_combined_all) into the 10-min base table (merged_data_10min) on Time.\n",
    "\n",
    "- Provides three plotting helpers:\n",
    "\n",
    "    - plot_temperature_profiles(...): compares T and θᵥ from MWR, Cloud Radar (CR), and Mast.\n",
    "\n",
    "    - plot_rh_profiles(...): compares RH (primary x-axis) and AH (secondary x-axis) for MWR, CR, and Mast.\n",
    "\n",
    "    - plot_all_profiles(...): a 3-panel figure for T/θᵥ, RH/AH, and qᵥ side-by-side; also draws CBL height and LCL lines if present.\n",
    "\n",
    "\n",
    "- Exports a 1-page PDF listing all columns in merged_data_10min.\n",
    "\n",
    "- Computes basic bulk‐flux estimates (u_star, Cd, SHF_bulk, LHF_bulk) and checks sign agreement with measured SHF/LHF.\n",
    "\n",
    "- Plots time series for SHF/LHF (measured vs bulk).\n",
    "\n",
    "- Builds diurnal plots for temperatures, dry static energy, virtual dry static energy, qv, wind speed, radiation components, momentum fluxes, and CO₂.\n",
    "\n",
    "- Computes/plots vertical gradients (lapse rates) of 𝑠𝑑𝑟𝑦,𝑣/𝑐𝑝 and marks zero crossings.\n",
    "\n",
    "- LW↓ vs IWV: plots downwelling longwave radiation against integrated water vapor and saves the scatter plot.\n",
    "\n",
    "- SW↓ vs LWP: plots downwelling shortwave irradiance against liquid water path (LWP) and saves it.\n",
    "\n",
    "- LWP vs Time: Shows how LWP changes over time in a time-series plot.\n",
    "\n",
    "- Temperature (2.99 m) vs Sonic: Compares mast temperature (at 2.99 m) with sonic temperature over time.\n",
    "\n",
    "- Wind (2.99 m) vs Sonic: Compares mast wind speed with sonic wind speed and saves the time plot.\n",
    "\n",
    "- Wind Speed Flags & Scatter: Marks and plots wind speed outliers (ΔWS > 1 m/s) and rain periods on a scatter plot.\n",
    "\n",
    "- SEB Scatter Plots: Plots four energy balance components (SHF, LHF, Net Radiation, G, F_CO₂) versus time (filtered by quality flags).\n",
    "\n",
    "- 30-min Flux Data: Loads the 30-minute averaged flux dataset for further comparisons.\n",
    "\n",
    "- External Stations Data: Loads and cleans flux, soil, and radiation data from Loobos, Amsterdam, and Veenkampen reference sites:\n",
    "\n",
    "    - SHF Comparison: Compares sensible heat flux (H) from all stations with your measured SHF.\n",
    "\n",
    "    - LHF Comparison: Compares latent heat flux (LE) from all stations with your measured LHF.\n",
    "\n",
    "    - CO₂ Flux Comparison: Compares CO₂ flux from all stations with your measured F_CO₂.\n",
    "\n",
    "    - Soil Fluxes: Plots soil heat flux (G) probes from Loobos and Veenkampen alongside your measured G.\n",
    "\n",
    "    - Net Radiation Comparison: Compares net radiation from all sites (with consistent sign convention) against your measured Net Rad.\n",
    "\n",
    "    - Residual Ground Heat Flux: Computes residual G = −(H + LE + Net Rad) for each site and compares with your measured G.\n",
    "  \n",
    "#### Lines you should change before running\n",
    "\n",
    "- other stations data folder:\n",
    "    - data_vl = r\"C:\\path\\to\\your\\Fluxes_Other_Stations_May_20_26\"\n",
    "    - vl_file  = os.path.join(data_vl, \"Veenkampen_Fluxes.csv\")\n",
    "    - loo_file = os.path.join(data_vl, \"Loobos_Fluxes.csv\")\n",
    "    - ams_file = os.path.join(data_vl, \"Amsterdam_Fluxes.csv\")\n",
    "\n",
    "- 10-minute dataset folder and date of interest:\n",
    "    - date_str  = \"2024-05-23\"           ← change date here\n",
    "    - month_str = date_str[:7]\n",
    "    - data_dir  = rf\"C:\\path\\to\\your\\Sonic\\{month_str}\\{date_str}\"\n",
    "    - input_file = os.path.join(data_dir, \"merged_data_10min.csv\")\n",
    "\n",
    "\n",
    "- Base directory containing CBL height outputs from the microwave radiometer:\n",
    "  cbl_dir = rf\"C:\\path\\to\\your\\Microwave_radiometer\\{month_str}\\{date_str}\"\n",
    "\n",
    "\n",
    "- MWR (IWV/LWP) folder: folder_path = rf\"C:\\path\\to\\your\\Microwave_radiometer\\{month_str}\\{date_str}\"\n",
    "\n",
    "- Cloud Radar folder: cloud_radar_dir = rf\"C:\\path\\to\\your\\Cloud_radar\\{month_str}\\{date_str}\"\n",
    "\n",
    "- folder_path = rf\"C:\\path\\to\\your\\Microwave_radiometer\\{month_str}\\{date_str}\"\n",
    "    - Files expected in folder_path: \n",
    "        - profiles_data_10min_avg.parquet, \n",
    "        - MWR_vertical_dataset_10min.parquet, \n",
    "        - vertical_temperature_profiles_10min_avg.parquet\n",
    "\n",
    "- pdf_path = rf\"C:\\path\\to\\your\\Master_Thesis\\merged_columns_list.pdf\"\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "95c7cd36",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import os\n",
    "import datetime\n",
    "from matplotlib.lines import Line2D\n",
    "\n",
    "\n",
    "!pip install pvlib\n",
    "from scipy.special import lambertw\n",
    "\n",
    "from matplotlib.dates import DateFormatter\n",
    "from datetime import time\n",
    "import pvlib\n",
    "import seaborn as sns\n",
    "from sklearn.linear_model import LinearRegression\n",
    "import matplotlib.dates as mdates\n",
    "from sklearn.metrics import mean_absolute_error, mean_squared_error\n",
    "from numpy.polynomial.polynomial import Polynomial\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "570bed43",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Constants\n",
    "Cp = 1005  # Specific heat capacity of dry air at constant pressure (J/kg/K)\n",
    "g = 9.81   # Acceleration due to gravity (m/s^2)\n",
    "\n",
    "A=6.11*100 #Pa\n",
    "beta=0.067 #K^-1\n",
    "Ttrip=273.16 #K\n",
    "epsilon=0.622\n",
    "sigma=5.67e-8 #W*m^-2*K^-4\n",
    "Lv = 2.5e6  # Latent heat of vaporization in J/kg\n",
    "\n",
    "Rd=287.04\n",
    "Rv=461.5\n",
    "\n",
    "rho_atm=1.225 #kg/m^3\n",
    "m_co2=0.044 #kg/mole molecular mass CO2\n",
    "m_atm=0.028 #molecular mass atmosphere"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "36fcbcb2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Function to calculate saturation pressure (es) from temperature (T)\n",
    "def calculate_saturation_pressure(T):\n",
    "    #es = A*np.exp(beta*(T+273.15-Ttrip))  # Saturation vapor pressure in Pa\n",
    "    es=610.78*np.exp(17.2694*(T-Ttrip)/(T-35.86))\n",
    "    return es\n",
    "def calculate_saturation_specific_humidity(es, p):\n",
    "    #qv = epsilon * e *1000 / (p*100) # Specific humidity in g/kg\n",
    "    qs=es*1000/(p*100+(((Rv/Rd)-1)*(p*100-es)))\n",
    "    return qs\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "14e72333",
   "metadata": {},
   "outputs": [],
   "source": [
    "###LCL\n",
    "# Function to calculate pressure at altitude z given surface pressure and temperature\n",
    "def calculate_pressure(P_surf, z, T_surf):\n",
    "    #return P_surf * np.exp(-g * z / (Rd * T))\n",
    "    return P_surf * (1+(-g*z/(T_surf*Cp)))**(Cp/Rd)\n",
    "def calculate_saturation_specific_humidity_lcl(T, p):\n",
    "    es = 610.78 * np.exp(17.2694 * (T - Ttrip) / (T - 35.86))\n",
    "    qs = es * 1000 / (((Rv / Rd) * (p * 100 - es)) + es)\n",
    "   # qs=Rd*es*1000/(Rv*p*100)\n",
    "    return qs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "833c7f10",
   "metadata": {},
   "outputs": [],
   "source": [
    "def lcl(p, T, rh=None, rhl=None, rhs=None, return_ldl=False, return_min_lcl_ldl=False):\n",
    "    \"\"\"\n",
    "    Compute the lifting-condensation level (LCL) [m] (or deposition level LDL, or min(LCL,LDL))\n",
    "    following Romps (2017).  Inputs must be in SI units:\n",
    "\n",
    "      • p : pressure [Pa]  (positive)\n",
    "      • T : temperature [K] (positive)\n",
    "      • Exactly one of rh, rhl, rhs (each between 0 and 1)\n",
    "\n",
    "    Returns a float (or NumPy array of floats) giving height in meters.\n",
    "\n",
    "    Raises ValueError if more than one (or none) of {rh, rhl, rhs} is given,\n",
    "    or if the inputs are out of range.\n",
    "    \"\"\"\n",
    "\n",
    "    # 1) Input validation\n",
    "    p_arr = np.asarray(p, dtype=float)\n",
    "    T_arr = np.asarray(T, dtype=float)\n",
    "    if np.any(p_arr <= 0):\n",
    "        raise ValueError(\"Pressure p must be > 0 Pa.\")\n",
    "    if np.any(T_arr <= 0):\n",
    "        raise ValueError(\"Temperature T must be > 0 K.\")\n",
    "\n",
    "    provided = [rh is not None, rhl is not None, rhs is not None]\n",
    "    if sum(provided) != 1:\n",
    "        raise ValueError(f\"Exactly one of rh, rhl, rhs must be specified (you provided {sum(provided)}).\")\n",
    "\n",
    "    # If arrays are passed for rh/rhl/rhs, cast them to NumPy arrays\n",
    "    rh_arr  = np.asarray(rh,  dtype=float) if rh  is not None else None\n",
    "    rhl_arr = np.asarray(rhl, dtype=float) if rhl is not None else None\n",
    "    rhs_arr = np.asarray(rhs, dtype=float) if rhs is not None else None\n",
    "\n",
    "    for name, arr in ((\"rh\", rh_arr), (\"rhl\", rhl_arr), (\"rhs\", rhs_arr)):\n",
    "        if arr is not None and np.any((arr < 0) | (arr > 1)):\n",
    "            raise ValueError(f\"{name} must be between 0 and 1.\")\n",
    "\n",
    "    # 2) Physical constants (Romps 2017)\n",
    "    Ttrip = 273.16   # K\n",
    "    ptrip = 611.65   # Pa\n",
    "    E0v   = 2.3740e6 # J/kg\n",
    "    E0s   = 0.3337e6 # J/kg\n",
    "    ggr   = 9.81     # m/s²\n",
    "    rgasa = 287.04   # J/(kg·K)\n",
    "    rgasv = 461      # J/(kg·K)\n",
    "    cva   = 719      # J/(kg·K)\n",
    "    cvv   = 1418     # J/(kg·K)\n",
    "    cvl   = 4119     # J/(kg·K)\n",
    "    cvs   = 1861     # J/(kg·K)\n",
    "    cpa   = cva + rgasa\n",
    "    cpv   = cvv + rgasv\n",
    "\n",
    "    # 3) Helper functions for saturation vapor pressure (Pa)\n",
    "    def pvstarl(Tval):\n",
    "        # over liquid\n",
    "        return ptrip * (Tval/Ttrip)**((cpv - cvl)/rgasv) * np.exp(\n",
    "            (E0v - (cvv - cvl)*Ttrip)/rgasv * (1/Ttrip - 1/Tval)\n",
    "        )\n",
    "\n",
    "    def pvstars(Tval):\n",
    "        # over ice\n",
    "        return ptrip * (Tval/Ttrip)**((cpv - cvs)/rgasv) * np.exp(\n",
    "            (E0v + E0s - (cvv - cvs)*Ttrip)/rgasv * (1/Ttrip - 1/Tval)\n",
    "        )\n",
    "\n",
    "    # 4) Compute vapor pressure pv from whichever RH was provided\n",
    "    if rh_arr is not None:\n",
    "        # If T ≥ Ttrip, use RH over liquid; else over ice\n",
    "        pv = np.where(T_arr >= Ttrip,\n",
    "                      rh_arr * pvstarl(T_arr),\n",
    "                      rh_arr * pvstars(T_arr))\n",
    "    elif rhl_arr is not None:\n",
    "        pv = rhl_arr * pvstarl(T_arr)\n",
    "    else:  # rhs_arr is not None\n",
    "        pv = rhs_arr * pvstars(T_arr)\n",
    "\n",
    "    # 5) If pv > p anywhere, set those points to NaN (no finite LCL)\n",
    "    pv = np.where(pv > p_arr, np.nan, pv)\n",
    "\n",
    "    # 6) Recompute all three humidity ratios at the ambient T:\n",
    "    rhl_new = pv / pvstarl(T_arr)\n",
    "    rhs_new = pv / pvstars(T_arr)\n",
    "    rh_new  = np.where(T_arr >= Ttrip, rhl_new, rhs_new)\n",
    "\n",
    "    # 7) Mixed‐gas parameters\n",
    "    qv   = rgasa * pv / (rgasv * p_arr + (rgasa - rgasv) * pv)\n",
    "    rgasm= (1 - qv)*rgasa + qv*rgasv\n",
    "    cpm  = (1 - qv)*cpa + qv*cpv\n",
    "\n",
    "    # 8) Dry‐adiabatic limit (if rh = 0, no condensation)\n",
    "    dry_case = (rh_new == 0)\n",
    "    lcl_dry = cpm * T_arr / ggr\n",
    "\n",
    "    # 9) Liquid branch coefficients\n",
    "    aL = -(cpv - cvl)/rgasv + cpm/rgasm\n",
    "    bL = -(E0v - (cvv - cvl)*Ttrip) / (rgasv * T_arr)\n",
    "    cL = (pv / pvstarl(T_arr)) * np.exp(-(E0v - (cvv - cvl)*Ttrip)/(rgasv * T_arr))\n",
    "\n",
    "    # 10) Solid branch coefficients\n",
    "    aS = -(cpv - cvs)/rgasv + cpm/rgasm\n",
    "    bS = -(E0v + E0s - (cvv - cvs)*Ttrip)/(rgasv * T_arr)\n",
    "    cS = (pv / pvstars(T_arr)) * np.exp(-(E0v + E0s - (cvv - cvs)*Ttrip)/(rgasv * T_arr))\n",
    "\n",
    "    # 11) Evaluate LambertW on branch = –1\n",
    "    W_L = lambertw(bL/aL * cL**(1.0/aL), k=-1).real\n",
    "    W_S = lambertw(bS/aS * cS**(1.0/aS), k=-1).real\n",
    "\n",
    "    lcl_liquid = cpm * T_arr / ggr * (1.0 - bL/(aL * W_L))\n",
    "    lcl_solid  = cpm * T_arr / ggr * (1.0 - bS/(aS * W_S))\n",
    "\n",
    "    # 12) Combine dry/condensing results\n",
    "    lcl_all = np.where(dry_case, lcl_dry, lcl_liquid)\n",
    "    ldl_all = np.where(dry_case, lcl_dry, lcl_solid)\n",
    "\n",
    "    # 13) Handle return flags\n",
    "    if return_ldl and return_min_lcl_ldl:\n",
    "        raise ValueError(\"Cannot set both return_ldl and return_min_lcl_ldl to True.\")\n",
    "    elif return_ldl:\n",
    "        return ldl_all.astype(float)\n",
    "    elif return_min_lcl_ldl:\n",
    "        return np.minimum(lcl_all, ldl_all).astype(float)\n",
    "    else:\n",
    "        return lcl_all.astype(float)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "78bf0a9a",
   "metadata": {},
   "source": [
    "### Compare CO2 flux"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f72141cf",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "#Edit this before running!!!\n",
    "#Path to data from other stations:\n",
    "data_vl = r\"C:\\path\\to\\your\\Fluxes_Other_Stations_May_20_26\"\n",
    "\n",
    "vl_file=os.path.join(data_vl,'Veenkampen_Fluxes.csv')\n",
    "vl=pd.read_csv(vl_file)\n",
    "loo_file=os.path.join(data_vl,'Loobos_Fluxes.csv')\n",
    "loo=pd.read_csv(loo_file)\n",
    "ams_file=os.path.join(data_vl,'Amsterdam_Fluxes.csv')\n",
    "ams=pd.read_csv(ams_file)\n",
    "\n",
    "print(loo.columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5c1b3f22",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "#Veenkampen\n",
    "#print(vl['co2_flux'])\n",
    "# Convert Timestamp column to datetime format\n",
    "vl['Timestamp'] = pd.to_datetime(vl['Timestamp'], format='%Y-%m-%d %H:%M:%S',errors='coerce')\n",
    "# Convert co2_flux column to numeric, coerce errors to NaN\n",
    "vl['co2_flux'] = pd.to_numeric(vl['co2_flux'], errors='coerce')\n",
    "\n",
    "# Drop rows with NaN values in co2_flux (if any)\n",
    "vl.dropna(subset=['co2_flux'], inplace=True)\n",
    "\n",
    "# Convert µmol m-2 s-1 to ppm m s^-1\n",
    "mol_m2_s = vl['co2_flux'] / 1e6  # Convert µmol m-2 s-1 to mol m-2 s-1\n",
    "ppm_m_s = (mol_m2_s / 22.414) * 1e6   # Convert mol m-2 s-1 to ppm m s^-1\n",
    "\n",
    "# Add the converted values back to the DataFrame\n",
    "vl['F_CO2_ppm_ms'] = ppm_m_s\n",
    "\n",
    "#Loobos\n",
    "# Convert Timestamp column to datetime format\n",
    "loo['Timestamp'] = pd.to_datetime(loo['Timestamp'], format='%Y-%m-%d %H:%M:%S',errors='coerce')\n",
    "# Convert co2_flux column to numeric, coerce errors to NaN\n",
    "loo['co2_flux'] = pd.to_numeric(loo['co2_flux'], errors='coerce')\n",
    "\n",
    "# Drop rows with NaN values in co2_flux (if any)\n",
    "loo.dropna(subset=['co2_flux'], inplace=True)\n",
    "\n",
    "# Convert µmol m-2 s-1 to ppm m s^-1\n",
    "# Add the converted values back to the DataFrame\n",
    "loo['F_CO2_ppm_ms'] = ((loo['co2_flux'] / 1e6 )/ 22.414) * 1e6  \n",
    "\n",
    "#Amsterdam\n",
    "\n",
    "# Convert Timestamp column to datetime format\n",
    "ams['Timestamp'] = pd.to_datetime(ams['Timestamp'], format='%Y-%m-%d %H:%M:%S',errors='coerce')\n",
    "# Convert co2_flux column to numeric, coerce errors to NaN\n",
    "ams['co2_flux'] = pd.to_numeric(ams['co2_flux'], errors='coerce')\n",
    "\n",
    "# Drop rows with NaN values in co2_flux (if any)\n",
    "ams.dropna(subset=['co2_flux'], inplace=True)\n",
    "\n",
    "# Convert µmol m-2 s-1 to ppm m s^-1\n",
    "# Add the converted values back to the DataFrame\n",
    "ams['F_CO2_ppm_ms'] = ((ams['co2_flux'] / 1e6 )/ 22.414) * 1e6  \n",
    "\n",
    "# Plot CO2 flux vs time\n",
    "plt.figure(figsize=(12, 6))\n",
    "plt.plot(vl['Timestamp'], vl['F_CO2_ppm_ms'], marker='o', linestyle='-', color='b',label='Veenkampen')\n",
    "plt.plot(loo['Timestamp'], loo['F_CO2_ppm_ms'], marker='x', linestyle='-', color='r',label='Loobos')\n",
    "plt.plot(ams['Timestamp'], ams['F_CO2_ppm_ms'], marker='+', linestyle='-', color='g',label='Amsterdam')\n",
    "\n",
    "plt.title('CO2 Flux vs Time')\n",
    "plt.xlabel('Time')\n",
    "plt.ylabel('CO2 Flux (ppm*ms^-1)')\n",
    "plt.grid(True)\n",
    "plt.legend()\n",
    "plt.xticks(rotation=45)\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "437e4f71",
   "metadata": {},
   "source": [
    "### 10 min dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cc35ffc7",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# 1) Single place to change:\n",
    "date_str  = '2024-05-23'    # ← only line you ever edit\n",
    "month_str = date_str[:7]    # '2024-03'\n",
    "#Edit this before running!!!\n",
    "# Define the path to the directory containing the data files\n",
    "data_dir  = rf\"C:\\path\\to\\your\\Sonic\\{month_str}\\{date_str}\"\n",
    "\n",
    "input_file = os.path.join(data_dir, 'merged_data_10min.csv')\n",
    "\n",
    "# Step 3: Load the data\n",
    "merged_data_10min = pd.read_csv(input_file)\n",
    "\n",
    "# Display the first few rows of the DataFrame to verify the data is loaded correctly\n",
    "print(merged_data_10min.columns)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5cb32e68",
   "metadata": {},
   "source": [
    "### LCL"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "95ef61dc",
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "'''\n",
    "# Filter data for the specific timestamp\n",
    "#specific_timestamp = '2024-05-03 13:10:00'\n",
    "#filtered_data = merged_data_10min[merged_data_10min['TIMESTAMP'] == specific_timestamp]\n",
    "# Heights (in meters)\n",
    "heights_m = np.array([2, 2.99, 4.47, 6.69, 10])\n",
    "\n",
    "# Extracted qv and qs values (in g/kg) at the given heights for a specific timestamp\n",
    "qv_values = np.array([filtered_data['qv_2m'].values[0], \n",
    "                      filtered_data['qv_2.99m'].values[0], \n",
    "                      filtered_data['qv_4.47m'].values[0], \n",
    "                      filtered_data['qv_6.69m'].values[0], \n",
    "                      filtered_data['qv_10m'].values[0]])\n",
    "\n",
    "qs_values = np.array([filtered_data['qs_2m'].values[0], \n",
    "                      filtered_data['qs_2.99m'].values[0], \n",
    "                      filtered_data['qs_4.47m'].values[0], \n",
    "                      filtered_data['qs_6.69m'].values[0], \n",
    "                      filtered_data['qs_10m'].values[0]])\n",
    "\n",
    "# Step 1: Linear fit (extrapolation) for qs\n",
    "# Fit a linear model for qs vs height\n",
    "qs_slope, qs_intercept = np.polyfit(heights_m, qs_values, 1)\n",
    "\n",
    "# Create a linear function for qs\n",
    "def qs_extrapolated(height):\n",
    "    return qs_slope * height + qs_intercept\n",
    "\n",
    "# Step 2: Assume constant qv (last value in qv_values array)\n",
    "qv_constant = qv_values[0]\n",
    "\n",
    "# Step 3: Find the height where qv_constant = qs_extrapolated(height)\n",
    "# Solve for height: qv_constant = qs_slope * height + qs_intercept\n",
    "LCL_height = (qv_constant - qs_intercept) / qs_slope\n",
    "\n",
    "# Plotting qv, qs and the extrapolated qs curve\n",
    "plt.figure(figsize=(10, 6))\n",
    "\n",
    "# Plot qv\n",
    "plt.plot(qv_values, heights_m, label='qv (Specific Humidity)', marker='o', linestyle='--', color='blue')\n",
    "\n",
    "# Plot qs\n",
    "plt.plot(qs_values, heights_m, label='qs (Saturation Specific Humidity)', marker='o', linestyle='-', color='orange')\n",
    "\n",
    "# Plot the extrapolated qs curve\n",
    "extrapolated_heights = np.linspace(0, LCL_height + 50, 100)  # We extend the height range a bit for visualization\n",
    "extrapolated_qs = qs_extrapolated(extrapolated_heights)\n",
    "plt.plot(extrapolated_qs, extrapolated_heights, label='Extrapolated qs', linestyle='-', color='orange', alpha=0.6)\n",
    "\n",
    "# Plot the LCL (intersection point)\n",
    "plt.axhline(LCL_height, color='green', linestyle='--', label=f'LCL at {LCL_height:.2f} m')\n",
    "\n",
    "# Annotations and labels\n",
    "plt.title(f'qv and qs Profiles with Extrapolated LCL for {specific_timestamp}')\n",
    "plt.xlabel('Specific Humidity (g/kg)')\n",
    "plt.ylabel('Height (m)')\n",
    "plt.grid()\n",
    "plt.legend()\n",
    "\n",
    "# Show the plot\n",
    "plt.show()\n",
    "\n",
    "print(f\"Estimated LCL height: {LCL_height:.2f} meters\")\n",
    "'''"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "45e0cf77",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "'''\n",
    "P_surf = filtered_data['BP_mbar_Avg'].values[0]  # Surface pressure in mbar\n",
    "T_surf = filtered_data['Temperature_K_2.99'].values[0]  # Surface temperature in Kelvin (e.g., from Temperature_K_2 column)\n",
    "heights_km = np.linspace(0, 10, 100)  # Heights from 0 to 10 km\n",
    "# Function to calculate saturation specific humidity (qs)\n",
    "def calculate_saturation_specific_humidity(T, p):\n",
    "    es = 610.78 * np.exp(17.2694 * (T - Ttrip) / (T - 35.86))\n",
    "    qs = es * 1000 / (((Rv / Rd) * (p * 100 - es)) + es)\n",
    "   # qs=Rd*es*1000/(Rv*p*100)\n",
    "    return qs\n",
    "# Function to calculate pressure at altitude z given surface pressure and temperature\n",
    "def calculate_pressure(P_surf, z, T_surf):\n",
    "    #return P_surf * np.exp(-g * z / (Rd * T))\n",
    "    return P_surf * (1+(-g*z/(T_surf*Cp)))**(Cp/Rd)\n",
    "\n",
    "# Arrays to store qs values\n",
    "qs_values = []\n",
    "\n",
    "# Calculate qs over height\n",
    "for z in heights_km * 1000:  # Convert heights to meters\n",
    "    # Temperature at height z (K), assuming a lapse rate of -9.8 K/km\n",
    "    T_z = T_surf - 9.8 * (z / 1000)\n",
    "    \n",
    "    # Pressure at height z\n",
    "    p_z = calculate_pressure(P_surf, z, T_surf)\n",
    "    \n",
    "    # Saturation specific humidity at height z\n",
    "    qs_z = calculate_saturation_specific_humidity(T_z, p_z)\n",
    "    \n",
    "    # Store qs value\n",
    "    qs_values.append(qs_z)\n",
    "\n",
    "# Convert qs_values to a numpy array for plotting\n",
    "qs_values = np.array(qs_values)\n",
    "\n",
    "qv_extrapolated = np.interp(heights_km, heights_m, qv_values)\n",
    "\n",
    "\n",
    "\n",
    "# Find the LCL: where qv and qs intersect\n",
    "LCL_height = None\n",
    "for i in range(len(heights_km)):\n",
    "    if qv_extrapolated[i] >= qs_values[i]:  # Find the point where qv crosses qs\n",
    "        LCL_height = heights_km[i]  # Height at which they intersect\n",
    "        break\n",
    "\n",
    "# Plotting qv and qs\n",
    "plt.figure(figsize=(10, 6))\n",
    "plt.plot(qv_extrapolated, heights_km, label='Extrapolated qv (Specific Humidity)', color='blue', linestyle='--', marker='o')\n",
    "plt.plot(qs_values, heights_km, label='qs (Saturation Specific Humidity)', color='orange', linestyle='-', marker='o')\n",
    "\n",
    "# Mark the LCL\n",
    "if LCL_height is not None:\n",
    "    plt.axhline(y=LCL_height, color='green', linestyle='--', label=f'LCL at {LCL_height:.2f} km')\n",
    "\n",
    "# Annotations and labels\n",
    "plt.title('Specific Humidity (qv) and Saturation Specific Humidity (qs)')\n",
    "plt.xlabel('Specific Humidity (g/kg)')\n",
    "plt.ylabel('Height (km)')\n",
    "plt.grid()\n",
    "plt.legend()\n",
    "plt.show()\n",
    "\n",
    "if LCL_height is not None:\n",
    "    print(f\"Estimated LCL height (Cloud Base Height): {LCL_height:.2f} km\")\n",
    "else:\n",
    "    print(\"No LCL found where qv intersects qs.\")\n",
    "# Optional: Print the qs values at each height for inspection\n",
    "for h, qs in zip(heights_km, qs_values):\n",
    "    print(f\"Height: {h:.2f} km, Saturation Specific Humidity: {qs:.4f} g/kg\")\n",
    "'''"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2162405b",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "measured_heights = np.array([2, 2.99, 4.47, 6.69, 10])\n",
    "\n",
    "# Heights in kilometers\n",
    "heights_km = np.linspace(0, 10, 100)  # Heights from 0 to 10 km\n",
    "\n",
    "\n",
    "# Prepare to calculate LCL for each timestamp in merged_data_10min\n",
    "lcl_list = []\n",
    "\n",
    "# Loop through each row in the DataFrame\n",
    "for index, row in merged_data_10min.iterrows():\n",
    "    # Extract surface pressure and temperatures for this timestamp\n",
    "    P_surf = row['BP_mbar_Avg']  # Surface pressure in mbar\n",
    "    T_surf = row['Temperature_K_2']  # Surface temperature in Kelvin\n",
    "\n",
    "    # Extract qv values for this timestamp\n",
    "    qv_values = np.array([\n",
    "        row['qv_2m'],\n",
    "        row['qv_2.99m'],\n",
    "        row['qv_4.47m'],\n",
    "        row['qv_6.69m'],\n",
    "        row['qv_10m']\n",
    "    ])  # Specific humidity in g/kg\n",
    "\n",
    "    # Arrays to store qs values\n",
    "    qs_values = []\n",
    "\n",
    "    # Calculate qs over height\n",
    "    for z in heights_km * 1000:  # Convert heights to meters\n",
    "        # Temperature at height z (K), assuming a lapse rate of -9.8 K/km\n",
    "        T_z = T_surf - 9.8 * (z / 1000)\n",
    "        \n",
    "        # Pressure at height z\n",
    "        p_z = calculate_pressure(P_surf, z, T_surf)\n",
    "        \n",
    "        # Saturation specific humidity at height z\n",
    "        qs_z = calculate_saturation_specific_humidity_lcl(T_z, p_z)\n",
    "        \n",
    "        # Store qs value\n",
    "        qs_values.append(qs_z)\n",
    "\n",
    "    # Convert qs_values to a numpy array for further processing\n",
    "    qs_values = np.array(qs_values)\n",
    "\n",
    "    # Extrapolate qv to the height range\n",
    "    qv_extrapolated = np.interp(heights_km, measured_heights / 1000, qv_values)\n",
    "\n",
    "    # Find the LCL: where qv and qs intersect\n",
    "    LCL_height = None\n",
    "    for i in range(len(heights_km)):\n",
    "        if qv_extrapolated[i] >= qs_values[i]:  # Find the point where qv crosses qs\n",
    "            LCL_height = heights_km[i]  # Height at which they intersect\n",
    "            break\n",
    "\n",
    "    # Append LCL height for this timestamp\n",
    "    lcl_list.append(LCL_height)\n",
    "\n",
    "# Convert LCL heights list to a DataFrame column\n",
    "merged_data_10min['LCL_qv'] = lcl_list\n",
    "\n",
    "\n",
    "\n",
    "# Convert the 'TIMESTAMP' column to datetime format for better plotting\n",
    "merged_data_10min['TIMESTAMP'] = pd.to_datetime(merged_data_10min['TIMESTAMP'])\n",
    "\n",
    "# Plotting LCL over time\n",
    "plt.figure(figsize=(12, 6))\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv'], marker='o', linestyle='-')\n",
    "plt.title('Lifting Condensation Level (LCL) Over Time')\n",
    "plt.xlabel('Timestamp')\n",
    "plt.ylabel('LCL Height (km)')\n",
    "plt.xticks(rotation=45)\n",
    "plt.grid()\n",
    "plt.tight_layout()\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "810e2fc0",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "# Example measured heights in meters\n",
    "measured_heights = np.array([2, 2.99, 4.47, 6.69, 10])  # Measured heights\n",
    "heights_km = np.linspace(0, 10, 1000)  # Heights in kilometers, ranging from 0 to 10 km\n",
    "\n",
    "# Prepare to calculate LCL for each timestamp and each T_surf from different heights\n",
    "lcl_dict = {f'LCL_qv_T_{h}m': [] for h in measured_heights}\n",
    "\n",
    "# Loop through each row in the DataFrame\n",
    "for index, row in merged_data_10min.iterrows():\n",
    "    \n",
    "    # Extract qv values for this timestamp (specific humidity)\n",
    "    qv_values = np.array([\n",
    "        row['qv_2m'],\n",
    "        row['qv_2.99m'],\n",
    "        row['qv_4.47m'],\n",
    "        row['qv_6.69m'],\n",
    "        row['qv_10m']\n",
    "    ])  # Specific humidity in g/kg\n",
    "\n",
    "    # Extract measured temperatures for this timestamp (in Kelvin)\n",
    "    measured_temperatures = np.array([\n",
    "        row['Temperature_K_2'],\n",
    "        row['Temperature_K_2.99'],\n",
    "        row['Temperature_K_4.47'],\n",
    "        row['Temperature_K_6.69'],\n",
    "        row['Temperature_K_10']\n",
    "    ])  # Temperatures at corresponding heights\n",
    "\n",
    "    # Arrays to store qs values (saturation specific humidity)\n",
    "    qs_values = []\n",
    "\n",
    "    # Loop through each T_surf for each height\n",
    "    for t_idx, T_surf in enumerate(measured_temperatures):\n",
    "        # Surface pressure for this timestamp (assumed constant across heights)\n",
    "        P_surf = row['BP_mbar_Avg']  # Surface pressure in mbar\n",
    "\n",
    "        # Calculate qs (saturation specific humidity) over height for this T_surf\n",
    "        qs_values = []\n",
    "        for z in heights_km * 1000:  # Convert heights from km to meters\n",
    "            # Temperature at height z (K), assuming a lapse rate of -9.8 K/km\n",
    "            T_z = T_surf - 9.8 * (z / 1000)\n",
    "\n",
    "            # Pressure at height z\n",
    "            p_z = calculate_pressure(P_surf, z, T_surf)\n",
    "\n",
    "            # Saturation specific humidity at height z\n",
    "            qs_z = calculate_saturation_specific_humidity_lcl(T_z, p_z)\n",
    "\n",
    "            # Store qs value\n",
    "            qs_values.append(qs_z)\n",
    "\n",
    "        # Convert qs_values to a numpy array for further processing\n",
    "        qs_values = np.array(qs_values)\n",
    "\n",
    "        # Extrapolate qv to the height range\n",
    "        qv_extrapolated = np.interp(heights_km, measured_heights / 1000, qv_values)\n",
    "\n",
    "        # Find the LCL: where qv and qs intersect\n",
    "        LCL_height = None\n",
    "        for i in range(len(heights_km)):\n",
    "            if qv_extrapolated[i] >= qs_values[i]:  # Find the point where qv crosses qs\n",
    "                LCL_height = heights_km[i]  # Height at which they intersect\n",
    "                break\n",
    "\n",
    "        # Append LCL height for this T_surf (for the corresponding height)\n",
    "        height_label = measured_heights[t_idx]\n",
    "        lcl_dict[f'LCL_qv_T_{height_label}m'].append(LCL_height)\n",
    "\n",
    "# Convert LCL heights list to new columns in the DataFrame\n",
    "for height_label, lcl_values in lcl_dict.items():\n",
    "    merged_data_10min[height_label] = lcl_values\n",
    "\n",
    "# Convert the 'TIMESTAMP' column to datetime format for better plotting\n",
    "merged_data_10min['TIMESTAMP'] = pd.to_datetime(merged_data_10min['TIMESTAMP'])\n",
    "\n",
    "# Plotting LCL over time for each surface temperature\n",
    "plt.figure(figsize=(12, 6))\n",
    "\n",
    "for height_label in measured_heights:\n",
    "    plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min[f'LCL_qv_T_{height_label}m'],\n",
    "             marker='o', linestyle='-', label=f'T_surf = {height_label}m')\n",
    "\n",
    "# Format the x-axis to show time in HH:MM format\n",
    "date_format = DateFormatter('%H:%M')\n",
    "plt.gca().xaxis.set_major_formatter(date_format)\n",
    "\n",
    "# Customize plot labels and title with larger fonts\n",
    "plt.title('Lifting Condensation Level (LCL) Over Time for Different Surface Temperatures', fontsize=18)\n",
    "plt.xlabel('Timestamp (UTC)', fontsize=16)\n",
    "plt.ylabel('LCL Height (km)', fontsize=16)\n",
    "\n",
    "# Rotate x-axis ticks for better readability\n",
    "plt.xticks(rotation=45, fontsize=12)\n",
    "plt.yticks(fontsize=12)\n",
    "\n",
    "# Show legend\n",
    "plt.legend(title='Surface Temperature Height', fontsize=12)\n",
    "\n",
    "# Show the grid\n",
    "plt.grid(True)\n",
    "\n",
    "# Adjust layout to prevent overlap\n",
    "plt.tight_layout()\n",
    "\n",
    "# Display plot\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "52fec3d1",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# === Plot LCL profile for a specific timestamp (e.g., index 10) ===\n",
    "\n",
    "index = 80  # Choose the timestamp row you want to plot\n",
    "row = merged_data_10min.iloc[index]\n",
    "\n",
    "# Use T_surf from 2m as example\n",
    "T_surf = row['Temperature_K_2.99']\n",
    "P_surf = row['BP_mbar_Avg']\n",
    "\n",
    "# Get qv values\n",
    "qv_values = np.array([\n",
    "    row['qv_2m'],\n",
    "    row['qv_2.99m'],\n",
    "    row['qv_4.47m'],\n",
    "    row['qv_6.69m'],\n",
    "    row['qv_10m']\n",
    "])\n",
    "\n",
    "# Calculate qs profile\n",
    "qs_profile = []\n",
    "for z in heights_km * 1000:\n",
    "    T_z = T_surf - 9.8 * (z / 1000)  # Dry lapse rate\n",
    "    p_z = calculate_pressure(P_surf, z, T_surf)\n",
    "    qs_z = calculate_saturation_specific_humidity_lcl(T_z, p_z)\n",
    "    qs_profile.append(qs_z)\n",
    "qs_profile = np.array(qs_profile)\n",
    "\n",
    "# Interpolate qv to full height range\n",
    "qv_profile = np.interp(heights_km, measured_heights / 1000, qv_values)\n",
    "\n",
    "# Find LCL\n",
    "LCL_height = np.nan\n",
    "for i in range(len(heights_km)):\n",
    "    if qv_profile[i] >= qs_profile[i]:\n",
    "        LCL_height = heights_km[i]\n",
    "        break\n",
    "\n",
    "# Plot nicely formatted LCL diagnostic\n",
    "plt.figure(figsize=(8, 6))\n",
    "\n",
    "# Plot extrapolated qv\n",
    "plt.plot(qv_profile, heights_km, 'o', color='royalblue', markersize=4, label=r'$q_v$ (extrapolated)')\n",
    "\n",
    "# Plot qs profile\n",
    "plt.plot(qs_profile, heights_km, '-', color='darkorange', linewidth=2, label=r'$q_{sat}$ (saturation)')\n",
    "\n",
    "# Plot LCL height\n",
    "if not np.isnan(LCL_height):\n",
    "    plt.axhline(LCL_height, color='seagreen', linestyle='--', linewidth=1.5, label=fr'LCL $\\approx$ {LCL_height:.2f} km')\n",
    "\n",
    "# Axes labels and title\n",
    "plt.xlabel('Specific Humidity (g/kg)', fontsize=13)\n",
    "plt.ylabel('Height (km)', fontsize=13)\n",
    "plt.title(f'LCL Profile at {row[\"TIMESTAMP\"]:%Y-%m-%d %H:%M:%S}', fontsize=14, weight='bold')\n",
    "\n",
    "# Ticks and grid\n",
    "plt.xticks(fontsize=11)\n",
    "plt.yticks(fontsize=11)\n",
    "plt.grid(True, linestyle='--', linewidth=0.5, alpha=0.7)\n",
    "\n",
    "# Legend\n",
    "plt.legend(fontsize=11, loc='upper right')\n",
    "\n",
    "# Layout\n",
    "plt.tight_layout()\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d1f8781f",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "# 1) Pick a row (timestamp) to plot\n",
    "index = 80\n",
    "row = merged_data_10min.iloc[index]\n",
    "\n",
    "# 2) Extract mast‐level temperatures (K) at each height (m)\n",
    "measured_heights = np.array([2, 2.99, 4.47, 6.69, 10])  # in meters\n",
    "measured_temperatures = np.array([\n",
    "    row['Temperature_K_2'],\n",
    "    row['Temperature_K_2.99'],\n",
    "    row['Temperature_K_4.47'],\n",
    "    row['Temperature_K_6.69'],\n",
    "    row['Temperature_K_10']\n",
    "])\n",
    "\n",
    "# 3) Surface pressure for computing pressure profile\n",
    "P_surf = row['BP_mbar_Avg']  # in mbar\n",
    "\n",
    "# 4) Build measured qv at mast levels and interpolate onto heights_km\n",
    "qv_values = np.array([\n",
    "    row['qv_2m'],\n",
    "    row['qv_2.99m'],\n",
    "    row['qv_4.47m'],\n",
    "    row['qv_6.69m'],\n",
    "    row['qv_10m']\n",
    "])\n",
    "qv_profile = np.interp(heights_km, measured_heights / 1000.0, qv_values)\n",
    "\n",
    "# 5) Colorblind‐friendly palette and distinct markers\n",
    "colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#9467bd', '#8c564b']\n",
    "markers = ['o', 'v', 's', 'D', 'X']\n",
    "\n",
    "# 6) Create a 12×6″ canvas\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "# 6a) Plot interpolated qv profile (black circles, larger marker)\n",
    "ax.plot(\n",
    "    qv_profile,\n",
    "    heights_km,\n",
    "    linestyle='None',\n",
    "    marker='o',\n",
    "    color='black',\n",
    "    markersize=3,\n",
    "    label=r'$q_v$ (interpolated)'\n",
    ")\n",
    "\n",
    "# 6b) Loop over each mast‐level temperature to compute qs(z)\n",
    "for t_idx, T_surf in enumerate(measured_temperatures):\n",
    "    height_m = measured_heights[t_idx]\n",
    "    color = colors[t_idx]\n",
    "    marker = markers[t_idx]\n",
    "\n",
    "    qs_profile = []\n",
    "    for z_km in heights_km:\n",
    "        z_m = z_km * 1000.0\n",
    "        T_z = T_surf - 9.8 * (z_m / 1000.0)\n",
    "        p_z = calculate_pressure(P_surf, z_m, T_surf)\n",
    "        qs_z = calculate_saturation_specific_humidity_lcl(T_z, p_z)\n",
    "        qs_profile.append(qs_z)\n",
    "    qs_profile = np.array(qs_profile)\n",
    "\n",
    "    # 6c) Determine LCL crossing (where qv_profile ≥ qs_profile)\n",
    "    LCL_height_km = np.nan\n",
    "    for i_h, z_km in enumerate(heights_km):\n",
    "        if qv_profile[i_h] >= qs_profile[i_h]:\n",
    "            LCL_height_km = z_km\n",
    "            break\n",
    "\n",
    "    # 6d) Plot the saturation‐specific‐humidity curve (larger marker)\n",
    "    ax.plot(\n",
    "        qs_profile,\n",
    "        heights_km,\n",
    "        linestyle='None',\n",
    "        marker=marker,\n",
    "        markersize=3,\n",
    "        color=color,\n",
    "        label=fr'$q_{{sat}}$ (T$_{{{height_m:.2f}\\,m}}$)'\n",
    "    )\n",
    "\n",
    "    # 6e) If LCL is found, draw a horizontal line (thicker)\n",
    "    if not np.isnan(LCL_height_km):\n",
    "        ax.axhline(\n",
    "            LCL_height_km,\n",
    "            color=color,\n",
    "            linestyle=':',\n",
    "            linewidth=2.0,\n",
    "            label=fr'LCL T$_{{{height_m:.2f}\\,m}} \\approx {LCL_height_km:.2f}$ km'\n",
    "        )\n",
    "\n",
    "# 7) Final formatting\n",
    "\n",
    "# 7a) Axis labels and title (larger, bold)\n",
    "ax.set_xlabel('q$_v$ (g/kg)', fontsize=16, fontweight='bold')\n",
    "ax.set_ylabel('Height (km)', fontsize=16, fontweight='bold')\n",
    "ax.set_title(\n",
    "    f'LCL Profiles at {row[\"TIMESTAMP\"]:%Y-%m-%d %H:%M:%S}',\n",
    "    fontsize=18,\n",
    "    fontweight='bold'\n",
    ")\n",
    "\n",
    "# 7b) Tick parameters (major = 14 pt, thicker ticks)\n",
    "ax.tick_params(axis='both', which='major', labelsize=14, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# 7c) Thicken spines to match other figures\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# 7d) Grid styling (major dashed, minor dotted)\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# 7e) Legend inside the plot area (e.g., upper right), larger font\n",
    "legend = ax.legend(\n",
    "    fontsize=12,\n",
    "    title='Legend',\n",
    "    title_fontsize=13,\n",
    "    frameon=True,\n",
    "    loc='upper right'\n",
    ")\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# 7f) Limit vertical axis to 0–4 km\n",
    "ax.set_ylim(0, 4.0)\n",
    "\n",
    "# 7g) Use identical tight_layout rect so axes occupy 78% width\n",
    "plt.tight_layout(rect=[0, 0, 0.78, 1.0])\n",
    "\n",
    "# 8) Save the figure in data_dir at 300 dpi\n",
    "output_path = os.path.join(data_dir, 'LCL_qv.png')\n",
    "plt.savefig(output_path, dpi=300, bbox_inches='tight')\n",
    "print(f\"Figure saved to: {output_path}\")\n",
    "\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bdf80b4e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Extract the IR20Dn values\n",
    "lw_dn = merged_data_10min['IR20Dn']\n",
    "Tcld = (lw_dn / sigma)**0.25  # Temperature in Kelvin\n",
    "\n",
    "merged_data_10min['Tcld']=Tcld\n",
    "\n",
    "# Add Tcld to the filtered data for reference\n",
    "#print(merged_data_10min)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9c6e46b2",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "'''\n",
    "# Filter data for the specific timestamp\n",
    "#specific_timestamp = '2024-05-03 13:10:00'\n",
    "#filtered_data = merged_data_10min[merged_data_10min['TIMESTAMP'] == specific_timestamp]\n",
    "\n",
    "# Constants for lapse rate\n",
    "lapse_rate = -9.8  # °C/km\n",
    "max_height = 10.0  # km\n",
    "heights_km = np.arange(0, max_height + 0.1, 0.1)  # Heights from 0 to 10 km in increments of 0.1 km\n",
    "\n",
    "# Measured heights and temperatures (in Kelvin)\n",
    "measured_heights = np.array([2, 2.99, 4.47, 6.69, 10])  # Heights in meters\n",
    "measured_temperatures = np.array([filtered_data['Temperature_K_2'].values[0],\n",
    "                                   filtered_data['Temperature_K_2.99'].values[0],\n",
    "                                   filtered_data['Temperature_K_4.47'].values[0],\n",
    "                                   filtered_data['Temperature_K_6.69'].values[0],\n",
    "                                   filtered_data['Temperature_K_10'].values[0]])  # Temperatures in Kelvin\n",
    "\n",
    "# Extrapolate each measured temperature\n",
    "extrapolated_temperatures = []\n",
    "for temp, height in zip(measured_temperatures, measured_heights):\n",
    "    # Calculate temperature at each height using lapse rate\n",
    "    extrapolated_temp = temp + lapse_rate * (heights_km - height / 1000)  # Convert height from m to km\n",
    "    extrapolated_temperatures.append(extrapolated_temp)\n",
    "\n",
    "# Convert the list to an array for plotting\n",
    "extrapolated_temperatures = np.array(extrapolated_temperatures)\n",
    "\n",
    "# Initialize list to store LCL data\n",
    "lcl_data = []\n",
    "\n",
    "# Calculate LCL heights\n",
    "for i in range(len(measured_temperatures)):\n",
    "    # Find where the extrapolated temperature equals Tcld\n",
    "    tcld_value = filtered_data['Tcld'].values[0]\n",
    "    for j, temp in enumerate(extrapolated_temperatures[i]):\n",
    "        if temp <= tcld_value:\n",
    "            lcl_height_km = heights_km[j]  # Height in km where Tcld is reached\n",
    "            lcl_data.append({'Measured Height (m)': measured_heights[i], 'LCL Height (km)': lcl_height_km})\n",
    "            break\n",
    "\n",
    "# Create DataFrame from LCL data\n",
    "lcl_df = pd.DataFrame(lcl_data)\n",
    "\n",
    "# Plotting\n",
    "plt.figure(figsize=(10, 6))\n",
    "\n",
    "# Plot extrapolated temperatures for each measured temperature\n",
    "for i in range(len(measured_temperatures)):\n",
    "    plt.plot(extrapolated_temperatures[i], heights_km, label=f'Extrapolated from {measured_heights[i]} m', linestyle='--')\n",
    "\n",
    "# Plot measured temperatures\n",
    "plt.scatter(measured_temperatures, measured_heights / 1000, color='orange', label='Measured Temperatures', zorder=5)\n",
    "\n",
    "# Add vertical line for Tcld\n",
    "plt.axvline(x=filtered_data['Tcld'].values[0], color='blue', linestyle='-', label=f'Tcld: {filtered_data[\"Tcld\"].values[0]:.2f} K')\n",
    "\n",
    "# Plot LCL heights\n",
    "#plt.scatter(lcl_df['LCL Height (km)'], lcl_df['Measured Height (m)']/1000, color='red', label='LCL Heights', zorder=5)\n",
    "\n",
    "# Annotations and legend\n",
    "plt.title('Extrapolated Temperature Profiles vs Height and LCL Levels')\n",
    "plt.xlabel('Temperature (K)')\n",
    "plt.ylabel('Height (km)')\n",
    "plt.grid()\n",
    "plt.legend()\n",
    "plt.show()\n",
    "\n",
    "# Display LCL DataFrame\n",
    "print(lcl_df)\n",
    "'''"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "848dfb46",
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# Constants for lapse rate\n",
    "lapse_rate = -9.8  # °C/km\n",
    "max_height = 10.0  # km\n",
    "heights_km = np.arange(0, max_height + 0.1, 0.01)  # Heights from 0 to 10 km in increments of 0.1 km\n",
    "\n",
    "# Measured heights and temperatures (in Kelvin)\n",
    "measured_heights = np.array([2, 2.99, 4.47, 6.69, 10])  # Heights in meters\n",
    "\n",
    "# Initialize a list to store LCL heights for each timestamp\n",
    "lcl_list = []\n",
    "\n",
    "# Loop through each row in the DataFrame\n",
    "for index, row in merged_data_10min.iterrows():\n",
    "    # Get Tcld for the current timestamp\n",
    "    tcld_value = row['Tcld']\n",
    "    \n",
    "    # Measured temperatures for this row\n",
    "    measured_temperatures = np.array([\n",
    "        row['Temperature_K_2'],\n",
    "        row['Temperature_K_2.99'],\n",
    "        row['Temperature_K_4.47'],\n",
    "        row['Temperature_K_6.69'],\n",
    "        row['Temperature_K_10']\n",
    "    ])  # Temperatures in Kelvin\n",
    "\n",
    "    # Initialize a list for LCL heights for this row\n",
    "    lcl_heights = []\n",
    "\n",
    "    # Extrapolate each measured temperature\n",
    "    for i in range(len(measured_temperatures)):\n",
    "        temp = measured_temperatures[i]\n",
    "        found_lcl = False\n",
    "        for j, h in enumerate(heights_km):\n",
    "            extrapolated_temp = temp + lapse_rate * (h - (measured_heights[i] / 1000))\n",
    "            if extrapolated_temp <= tcld_value:\n",
    "                lcl_heights.append(h)  # Height in km where Tcld is reached\n",
    "                found_lcl = True\n",
    "                break\n",
    "        if not found_lcl:\n",
    "            lcl_heights.append(np.nan)  # No LCL found for this temperature\n",
    "\n",
    "    # Append the list of LCL heights for this timestamp\n",
    "    lcl_list.append(lcl_heights)\n",
    "    #print(lcl_list)\n",
    "# Convert LCL heights list to a DataFrame column\n",
    "#lcl_array = pd.DataFrame(lcl_list, columns=['LCL_2m', 'LCL_2.99m', 'LCL_4.47m', 'LCL_6.69m', 'LCL_10m'])\n",
    "\n",
    "# Combine the LCL heights with the original DataFrame\n",
    "#merged_data_10min = pd.concat([merged_data_10min, lcl_array], axis=1)\n",
    "\n",
    "# Display the DataFrame with LCL heights\n",
    "#print(merged_data_10min[['TIMESTAMP', 'LCL_2m', 'LCL_2.99m', 'LCL_4.47m', 'LCL_6.69m', 'LCL_10m']])#LCL in km\n",
    "# Add the LCL heights as a new column to the DataFrame\n",
    "merged_data_10min['LCL'] = lcl_list\n",
    "\n",
    "# Display the DataFrame with LCL heights\n",
    "print(merged_data_10min[['TIMESTAMP', 'LCL']])\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d7472691",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "# 1) Select the same timestamp as before\n",
    "index = 80\n",
    "row = merged_data_10min.iloc[index]\n",
    "\n",
    "# 2) Extract mast heights (m) and surface temperatures (K)\n",
    "measured_heights = np.array([2, 2.99, 4.47, 6.69, 10])  # meters\n",
    "measured_temperatures = np.array([\n",
    "    row['Temperature_K_2'],\n",
    "    row['Temperature_K_2.99'],\n",
    "    row['Temperature_K_4.47'],\n",
    "    row['Temperature_K_6.69'],\n",
    "    row['Temperature_K_10']\n",
    "])\n",
    "\n",
    "# 3) Compute brightness‐temperature‐equivalent for cloud base\n",
    "T_cld = (row['IR20Dn'] / sigma) ** 0.25  # K\n",
    "\n",
    "# 4) Define a height array from 0 to 4 km\n",
    "heights_km = np.linspace(0, 4, 100)\n",
    "\n",
    "# 5) Choose consistent, color‐blind‐friendly palette and markers\n",
    "colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#9467bd', '#8c564b']\n",
    "markers = ['o', 'v', 's', 'D', 'X']\n",
    "\n",
    "# 6) Create figure and axes with the same 12×6″ size as other plots\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "# 6a) Loop over each mast‐level surface temperature to plot dry‐adiabatic profile\n",
    "for t_idx, T_surf in enumerate(measured_temperatures):\n",
    "    height_label = measured_heights[t_idx]\n",
    "    color = colors[t_idx]\n",
    "    marker = markers[t_idx]\n",
    "\n",
    "    # Dry‐adiabatic temperature profile: T(z) = T_surf − 9.8 K/km × z_km\n",
    "    temp_profile = T_surf - 9.8 * heights_km\n",
    "\n",
    "    # Find cloud‐base height where temp_profile ≤ T_cld\n",
    "    cbh = np.nan\n",
    "    for i, z_km in enumerate(heights_km):\n",
    "        if temp_profile[i] <= T_cld:\n",
    "            cbh = z_km\n",
    "            break\n",
    "\n",
    "    # Plot dry‐adiabatic profile with large markers\n",
    "    ax.plot(\n",
    "        temp_profile,\n",
    "        heights_km,\n",
    "        linestyle='None',\n",
    "        marker=marker,\n",
    "        markersize=3,\n",
    "        color=color,\n",
    "        label=fr'T$_{{{height_label:.2f}\\,\\mathrm{{m}}}}$ profile'\n",
    "    )\n",
    "\n",
    "    # Plot horizontal line at CBH if found\n",
    "    if not np.isnan(cbh):\n",
    "        ax.axhline(\n",
    "            cbh,\n",
    "            linestyle=':',\n",
    "            color=color,\n",
    "            linewidth=2.0,\n",
    "            label=fr'CBH T$_{{{height_label:.2f}\\,\\mathrm{{m}}}} \\approx {cbh:.2f}\\,\\mathrm{{km}}$'\n",
    "        )\n",
    "\n",
    "# 6b) Plot vertical line for brightness temperature T_cld\n",
    "ax.axvline(\n",
    "    T_cld,\n",
    "    color='black',\n",
    "    linestyle='--',\n",
    "    linewidth=2.0,\n",
    "    label=fr'$T_{{cld}}$ = {T_cld:.2f} K'\n",
    ")\n",
    "\n",
    "# 7) Final formatting\n",
    "\n",
    "# 7a) Axis labels and title with larger fonts to match other figures\n",
    "ax.set_xlabel('Temperature (K)', fontsize=16, fontweight='bold')\n",
    "ax.set_ylabel('Height (km)', fontsize=16, fontweight='bold')\n",
    "ax.set_title(\n",
    "    f'Radiative CBH Profiles at {row[\"TIMESTAMP\"]:%Y-%m-%d %H:%M:%S}',\n",
    "    fontsize=18,\n",
    "    fontweight='bold'\n",
    ")\n",
    "\n",
    "# 7b) Tick parameters: major ticks at 14 pt, thicker lines\n",
    "ax.tick_params(axis='both', which='major', labelsize=14, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# 7c) Thicken spines (axis borders)\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# 7d) Grid styling: major dashed, minor dotted\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# 7e) Legend inside the plot area (upper right), with larger font\n",
    "legend = ax.legend(\n",
    "    fontsize=12,\n",
    "    title='Profiles & CBH',\n",
    "    title_fontsize=13,\n",
    "    frameon=True,\n",
    "    loc='upper right'\n",
    ")\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# 7f) Restrict vertical axis to 0–4 km (same as heights_km)\n",
    "ax.set_ylim(0, 4.0)\n",
    "\n",
    "# 7g) Use identical tight_layout rectangle so axes occupy 78% width\n",
    "plt.tight_layout(rect=[0, 0, 0.78, 1.0])\n",
    "\n",
    "# 8) Save figure in data_dir at 300 dpi\n",
    "output_path = os.path.join(data_dir, 'LCL_Tcld.png')\n",
    "plt.savefig(output_path, dpi=300, bbox_inches='tight')\n",
    "print(f\"Figure saved to: {output_path}\")\n",
    "\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bb7449ce",
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# Plot LCL at 2m over time\n",
    "plt.figure(figsize=(10, 6))\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[0])\n",
    ", marker='o', label='LCL T2m')\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[1])\n",
    ", marker='o', label='LCL T2.99m')\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[2])\n",
    ", marker='o', label='LCL T4.47m')\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[3])\n",
    ", marker='o', label='LCL T6.69m')\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[4])\n",
    ", marker='o', label='LCL T10m')\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv'], marker='o', linestyle='-',label='LCL from qv')\n",
    "\n",
    "plt.xlabel('Timestamp')\n",
    "plt.ylabel('LCL Height (km)')\n",
    "plt.title('LCL at 2m Over Time')\n",
    "plt.xticks(rotation=45)  # Rotate timestamp labels for better readability\n",
    "plt.grid(True)\n",
    "plt.legend()\n",
    "plt.tight_layout()\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "709557ab",
   "metadata": {},
   "outputs": [],
   "source": [
    "# (Assume merged_data_10min is already loaded, with columns:\n",
    "#    • “LCL”                → shape (n_times, 5) after vstack\n",
    "#    • “LCL_qv_T_2.0m” … etc → length‐n_times each\n",
    "#  And data_dir is defined.)\n",
    "\n",
    "# 1) Stack radiative‐LCL and qv‐LCL into 2D arrays of shape (n_times, 5)\n",
    "radiative_LCL = np.vstack(merged_data_10min[\"LCL\"].values)  # (n_times, 5)\n",
    "qv_LCL = np.column_stack([\n",
    "    merged_data_10min[\"LCL_qv_T_2.0m\"].values,\n",
    "    merged_data_10min[\"LCL_qv_T_2.99m\"].values,\n",
    "    merged_data_10min[\"LCL_qv_T_4.47m\"].values,\n",
    "    merged_data_10min[\"LCL_qv_T_6.69m\"].values,\n",
    "    merged_data_10min[\"LCL_qv_T_10.0m\"].values\n",
    "])  # (n_times, 5)\n",
    "\n",
    "# 2) Height labels and a single color for each subplot (you can re‐use colors if you like):\n",
    "height_labels = [\"2 m\", \"2.99 m\", \"4.47 m\", \"6.69 m\", \"10 m\"]\n",
    "subplot_colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#9467bd', '#8c564b']\n",
    "\n",
    "# 3) Create a 1×5 grid of subplots, share x‐ and y‐limits for uniform appearance\n",
    "fig, axes = plt.subplots(\n",
    "    nrows=1,\n",
    "    ncols=5,\n",
    "    figsize=(18, 4),  # wide enough to accommodate five panels\n",
    "    sharex=True,\n",
    "    sharey=True\n",
    ")\n",
    "\n",
    "# 4) Determine maximum LCL (both methods) over all heights/times for a common axis range\n",
    "all_vals = np.concatenate([radiative_LCL.flatten(), qv_LCL.flatten()])\n",
    "vmax = np.nanmax(all_vals) * 1.05  # a little padding\n",
    "\n",
    "for j, ax in enumerate(axes):\n",
    "    x = radiative_LCL[:, j]\n",
    "    y = qv_LCL[:, j]\n",
    "    mask_both = np.isfinite(x) & np.isfinite(y)\n",
    "\n",
    "    # 4a) Plot paired points (both methods valid) in one color\n",
    "    ax.scatter(\n",
    "        x[mask_both],\n",
    "        y[mask_both],\n",
    "        s=20,\n",
    "        alpha=0.6,\n",
    "        color=subplot_colors[j],\n",
    "        marker='o',\n",
    "        edgecolor='black'\n",
    "    )\n",
    "\n",
    "    # 4b) Plot “radiative‐only” points below y = 0.0 (if desired)\n",
    "    mask_rad_only = np.isfinite(x) & ~np.isfinite(y)\n",
    "    if mask_rad_only.sum() > 0:\n",
    "        ax.scatter(\n",
    "            x[mask_rad_only],\n",
    "            np.full(mask_rad_only.sum(), -0.05),\n",
    "            s=15,\n",
    "            color='gray',\n",
    "            marker='x',\n",
    "            alpha=0.4\n",
    "        )\n",
    "\n",
    "    # 4c) Plot “qᵥ‐only” points left of x = 0.0 (if desired)\n",
    "    mask_qv_only = ~np.isfinite(x) & np.isfinite(y)\n",
    "    if mask_qv_only.sum() > 0:\n",
    "        ax.scatter(\n",
    "            np.full(mask_qv_only.sum(), -0.05),\n",
    "            y[mask_qv_only],\n",
    "            s=30,\n",
    "            color='black',\n",
    "            marker='+',\n",
    "            alpha=0.2\n",
    "        )\n",
    "\n",
    "    # 5) Reference 1:1 line (only on the middle subplot to reduce clutter, or on all if you prefer)\n",
    "    ax.plot([0, vmax], [0, vmax], linestyle='--', color='gray', linewidth=1.2)\n",
    "\n",
    "    # 6) Formatting per subplot\n",
    "    ax.set_xlim(-0.1, vmax)\n",
    "    ax.set_ylim(-0.1, vmax)\n",
    "    ax.set_aspect('equal', 'box')\n",
    "\n",
    "    # Only label outer axes\n",
    "    if j == 0:\n",
    "        ax.set_ylabel(\"Thermo LCL (qᵥ) (km)\", fontsize=12, fontweight='bold')\n",
    "    ax.set_title(f\"{height_labels[j]}\", fontsize=12, fontweight='bold')\n",
    "    ax.tick_params(labelsize=10)\n",
    "\n",
    "# 7) Common X‐label (put below all subplots)\n",
    "fig.text(0.5, -0.02, \"Radiative LCL (km)\", ha='center', fontsize=12, fontweight='bold')\n",
    "\n",
    "# 8) Overall figure title\n",
    "fig.suptitle(\"Radiative vs. Thermodynamic (qᵥ) LCL, by Start Height\", fontsize=16, fontweight='bold')\n",
    "\n",
    "plt.tight_layout(rect=[0, 0.05, 1, 0.95])\n",
    "\n",
    "# 9) Save the figure in data_dir at 300 dpi\n",
    "#save_path = os.path.join(data_dir, \"LCL_scatter_each_height.png\")\n",
    "#plt.savefig(save_path, dpi=300, bbox_inches='tight')\n",
    "#print(f\"Figure saved to: {save_path}\")\n",
    "\n",
    "plt.show()\n",
    "# 1) Reconstruct the 2D arrays if you haven’t already:\n",
    "radiative_LCL = np.vstack(merged_data_10min[\"LCL\"].values)  \n",
    "qv_LCL = np.column_stack([\n",
    "    merged_data_10min[\"LCL_qv_T_2.0m\"].values,\n",
    "    merged_data_10min[\"LCL_qv_T_2.99m\"].values,\n",
    "    merged_data_10min[\"LCL_qv_T_4.47m\"].values,\n",
    "    merged_data_10min[\"LCL_qv_T_6.69m\"].values,\n",
    "    merged_data_10min[\"LCL_qv_T_10.0m\"].values\n",
    "])\n",
    "\n",
    "n_times, n_heights = radiative_LCL.shape\n",
    "print(f\"Total time steps: {n_times}, Heights: {n_heights}\")\n",
    "\n",
    "# 2) Build boolean masks\n",
    "valid_rad  = np.isfinite(radiative_LCL)\n",
    "valid_qv   = np.isfinite(qv_LCL)\n",
    "mask_both  = valid_rad & valid_qv\n",
    "mask_rad_only = valid_rad & (~valid_qv)\n",
    "mask_qv_only  = (~valid_rad) & valid_qv\n",
    "\n",
    "# 3) Count “both” per height\n",
    "both_counts = np.sum(mask_both, axis=0)         # length = 5\n",
    "rad_only_counts = np.sum(mask_rad_only, axis=0)\n",
    "qv_only_counts  = np.sum(mask_qv_only, axis=0)\n",
    "\n",
    "# 4) Display results\n",
    "height_labels = [\"2 m\", \"2.99 m\", \"4.47 m\", \"6.69 m\", \"10 m\"]\n",
    "print(\"\\nPoint Counts by Height (radiative vs. qᵥ):\")\n",
    "for idx, label in enumerate(height_labels):\n",
    "    print(f\"  • {label}:\")\n",
    "    print(f\"      – Both valid:     {both_counts[idx]}\")\n",
    "    print(f\"      – Radiative-only: {rad_only_counts[idx]}\")\n",
    "    print(f\"      – qᵥ-only:        {qv_only_counts[idx]}\")\n",
    "\n",
    "# 5) Totals across all heights\n",
    "total_both = both_counts.sum()\n",
    "total_rad_only = rad_only_counts.sum()\n",
    "total_qv_only  = qv_only_counts.sum()\n",
    "\n",
    "print(\"\\nOverall Totals across all heights:\")\n",
    "print(f\"  • Both valid (plotted normally): {total_both}\")\n",
    "print(f\"  • Radiative-only (plotted at y=–0.05): {total_rad_only}\")\n",
    "print(f\"  • qᵥ-only (plotted at x=–0.05):    {total_qv_only}\")\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aa1ae735",
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# Create the plot with a larger size for better visibility\n",
    "plt.figure(figsize=(12, 7))\n",
    "\n",
    "# Plot LCL data with different markers but less intense black (alpha adjusted)\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[0]), \n",
    "         marker='o', linestyle='', color='black', alpha=0.5, label='LCL T2m')\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[1]), \n",
    "         marker='v', linestyle='', color='black', alpha=0.5, label='LCL T2.99m')\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[2]), \n",
    "         marker='s', linestyle='', color='black', alpha=0.5, label='LCL T4.47m')\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[3]), \n",
    "         marker='D', linestyle='', color='black', alpha=0.5, label='LCL T6.69m')\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[4]), \n",
    "         marker='x', linestyle='', color='black', alpha=0.5, label='LCL T10m')\n",
    "\n",
    "# Now, we will add the LCL from qv with the same markers as the above plots but with a red color\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_2.0m'], \n",
    "         marker='o', linestyle='', color='red', alpha=0.5, label='LCL qv T2m')\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_2.99m'], \n",
    "         marker='v', linestyle='', color='red', alpha=0.5, label='LCL qv T2.99m')\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_4.47m'], \n",
    "         marker='s', linestyle='', color='red', alpha=0.5, label='LCL qv T4.47m')\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_6.69m'], \n",
    "         marker='D', linestyle='', color='red', alpha=0.5, label='LCL qv T6.69m')\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_10.0m'], \n",
    "         marker='x', linestyle='', color='red', alpha=0.5, label='LCL qv T10m')\n",
    "\n",
    "# Formatting the x-axis for a full day (00:00 to 23:59)\n",
    "plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))  # Format to display hours and minutes\n",
    "plt.gca().xaxis.set_major_locator(mdates.HourLocator(interval=2))   # Set tick locator to display every 2 hours\n",
    "plt.gcf().autofmt_xdate()  # Rotate time labels for better readability\n",
    "\n",
    "# Set axis labels and title with larger fonts\n",
    "plt.xlabel('Time (Full Day)', fontsize=14)\n",
    "plt.ylabel('LCL Height (km)', fontsize=14)\n",
    "plt.title('LCL at Various Heights Over Time (Full Day)', fontsize=16)\n",
    "\n",
    "# Enhance the legend with better positioning\n",
    "plt.legend(loc='upper left', fontsize=12)\n",
    "\n",
    "# Add a grid with subtle enhancements\n",
    "plt.grid(True, linestyle='--', linewidth=0.6, alpha=0.7)\n",
    "\n",
    "# Adjust layout to prevent overlap\n",
    "plt.tight_layout()\n",
    "\n",
    "# Display the plot\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b74161a0",
   "metadata": {},
   "outputs": [],
   "source": [
    "# … after you load merged_data_10min via pd.read_csv(…):\n",
    "lcl_list_new = []\n",
    "\n",
    "for index, row in merged_data_10min.iterrows():\n",
    "    # 1) Extract P_surf, T_surf for this timestamp\n",
    "    P_surf_mbar = row['BP_mbar_Avg']\n",
    "    T_surf       = row['Temperature_K_2']  # or whichever level you prefer\n",
    "    \n",
    "    # 2) Extract RH at surface\n",
    "    #    (Adjust column name and /100.0 if needed.)\n",
    "    RH_surf_pct = row['RH_E5567_Avg']  \n",
    "    rh_surf = RH_surf_pct / 100.0\n",
    "\n",
    "    # 3) Convert pressure to Pa\n",
    "    p_surf = P_surf_mbar * 100.0\n",
    "\n",
    "    # 4) Compute thermodynamic LCL (in meters) via Romps function\n",
    "    #    We only need lcl(...), so leave return_ldl=False, return_min_lcl_ldl=False\n",
    "    lcl_thermo_m = lcl(p_surf, T_surf, rh=rh_surf)\n",
    "\n",
    "    # 5) If you prefer LCL in kilometers, do:\n",
    "    lcl_thermo_km = float(lcl_thermo_m) / 1000.0\n",
    "\n",
    "    # 6) Append to a list\n",
    "    lcl_list_new.append(lcl_thermo_km)\n",
    "\n",
    "# After the loop:\n",
    "merged_data_10min['LCL_romps_km'] = lcl_list_new\n",
    "print(merged_data_10min['LCL_romps_km'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5564df9f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Base directory containing daily CBL height outputs from the microwave radiometer\n",
    "# Edit this path to match your local setup before running.\n",
    "# Example: cbl_dir = r\"C:\\path\\to\\your\\Microwave_radiometer\\2024-05\\2024-05-23\"\n",
    "cbl_dir = rf\"C:\\path\\to\\your\\Microwave_radiometer\\{month_str}\\{date_str}\"\n",
    "\n",
    "cbl_file = os.path.join(cbl_dir, f'cbl_height_{date_str}.csv')\n",
    "\n",
    "df_cbl = pd.read_csv(cbl_file, parse_dates=['Time'])\n",
    "print(df_cbl)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9588a254",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Ensure LCL timestamps are in datetime\n",
    "df_cbl['Time'] = pd.to_datetime(df_cbl['Time'])\n",
    "\n",
    "# Merge on timestamp (inner join to keep overlapping times only)\n",
    "merged_data_10min = pd.merge(merged_data_10min, df_cbl, how='inner', left_on='TIMESTAMP', right_on='Time')\n",
    "\n",
    "# Optional: remove the duplicate 'Time' column\n",
    "merged_data_10min.drop(columns=['Time'], inplace=True)\n",
    "print(merged_data_10min)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dd13e98b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create the plot with a larger size for better visibility\n",
    "plt.figure(figsize=(12, 7))\n",
    "\n",
    "# Plot LCL data with different markers but less intense black (alpha adjusted)\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[0]), marker='o',\n",
    "        linestyle=' ', color='black', alpha=0.5, label='LCL T2m')# marker='o'\n",
    "#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[1]), \n",
    "      #   marker='v', linestyle='', color='black', alpha=0.5, label='LCL T2.99m')\n",
    "#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[2]), \n",
    "       #  marker='s', linestyle='', color='black', alpha=0.5, label='LCL T4.47m')\n",
    "#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[3]), \n",
    "        # marker='D', linestyle='', color='black', alpha=0.5, label='LCL T6.69m')\n",
    "#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[4]), \n",
    "         #marker='x', linestyle='', color='black', alpha=0.5, label='LCL T10m')\n",
    "\n",
    "# Now, we will add the LCL from qv with the same markers as the above plots but with a red color\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_2.0m'], marker='o',\n",
    "          linestyle='', color='red', alpha=0.5, label='LCL qv T2m')#marker='o',\n",
    "#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_2.99m'], \n",
    "       #  marker='v', linestyle='', color='red', alpha=0.5, label='LCL qv T2.99m')\n",
    "#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_4.47m'], \n",
    "        # marker='s', linestyle='', color='red', alpha=0.5, label='LCL qv T4.47m')\n",
    "#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_6.69m'], \n",
    "        # marker='D', linestyle='', color='red', alpha=0.5, label='LCL qv T6.69m')\n",
    "#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_10.0m'], \n",
    "         #marker='x', linestyle='', color='red', alpha=0.5, label='LCL qv T10m')\n",
    "\n",
    "# Plot CBL height as a line (in meters → convert to km to match y-axis)\n",
    "#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['zi'] / 1000,\n",
    " #        color='blue', linewidth=2, label='CBL Height $z_i$')\n",
    "# Plot Romps thermodynamic LCL\n",
    "plt.plot(merged_data_10min['TIMESTAMP'],\n",
    "         merged_data_10min['LCL_romps_km'], marker='o',alpha=0.5,\n",
    "         color='green', linestyle='', linewidth=2, label='LCL (Romps)')\n",
    "# Plot Parcel Method CBL height (in km) with dashed orange line\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['zi_parcel'] / 1000,\n",
    "         color='orange', linestyle='-', linewidth=2, label='CBL Height $z_i$ (Parcel Method)')\n",
    "\n",
    "\n",
    "plt.fill_between(merged_data_10min['TIMESTAMP'], \n",
    "                 merged_data_10min['zi_parcel'] / 1000, \n",
    "                 merged_data_10min['LCL'].apply(lambda x: x[0]),  # or another LCL source\n",
    "                 where=(merged_data_10min['LCL'].apply(lambda x: x[0]) <= merged_data_10min['zi_parcel'] / 1000),\n",
    "                 color='orange', alpha=0.1, label='LCL <= CBL (possible cloud layer)')\n",
    "\n",
    "\n",
    "# Formatting the x-axis for a full day (00:00 to 23:59)\n",
    "plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))  # Format to display hours and minutes\n",
    "plt.gca().xaxis.set_major_locator(mdates.HourLocator(interval=2))   # Set tick locator to display every 2 hours\n",
    "plt.gcf().autofmt_xdate()  # Rotate time labels for better readability\n",
    "\n",
    "# Set axis labels and title with larger fonts\n",
    "plt.xlabel('Time (Full Day)', fontsize=20)\n",
    "plt.ylabel('Height (km)', fontsize=20)\n",
    "plt.xticks(fontsize=18)  # or any size you prefer\n",
    "plt.yticks(fontsize=18)  # or any size you prefer\n",
    "\n",
    "plt.title('LCL at Various Heights Over Time (Full Day)', fontsize=20)\n",
    "# Enhance the legend with better positioning\n",
    "plt.legend(loc='upper left', fontsize=14)\n",
    "\n",
    "# Add a grid with subtle enhancements\n",
    "plt.grid(True, linestyle='--', linewidth=0.6, alpha=0.7)\n",
    "\n",
    "# Adjust layout to prevent overlap\n",
    "plt.tight_layout()\n",
    "\n",
    "# Display the plot\n",
    "plt.show()\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2eeae69a",
   "metadata": {},
   "source": [
    "### Sonic - Mast comparisons"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "80700c9a",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "# Convert the 'TIMESTAMP' column to datetime\n",
    "merged_data_10min['TIMESTAMP'] = pd.to_datetime(merged_data_10min['TIMESTAMP'])\n",
    "\n",
    "# Plotting Temperature_K_2.99 vs time and Average_Temperature_Corr vs time as scatter plots\n",
    "plt.figure(figsize=(12, 6))\n",
    "\n",
    "# Scatter plot for Temperature_K_2.99\n",
    "plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['Temperature_K_2.99'], label='Temperature_K_2.99', alpha=0.6, s=10, c='blue')\n",
    "\n",
    "# Scatter plot for Average_Temperature_Corr\n",
    "plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['Average_Temperature_Corr'], label='Average_Temperature_Corr', alpha=0.6, s=10, c='red')\n",
    "\n",
    "# Adding labels and title\n",
    "plt.xlabel('Time')\n",
    "plt.ylabel('Temperature (K)')\n",
    "plt.title('Temperature Comparison Mast - Sonic (10 min intervals)')\n",
    "plt.legend()\n",
    "plt.grid(True)\n",
    "\n",
    "# Formatting the x-axis\n",
    "plt.gca().xaxis.set_major_locator(mdates.HourLocator(interval=1))\n",
    "plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%m-%d %H:%M'))\n",
    "\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "# Show plot\n",
    "plt.tight_layout()\n",
    "# Save the plot to a file\n",
    "output_file = os.path.join(data_dir, 'temperature_comparison_plot_10min.png')\n",
    "plt.savefig(output_file)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "03902de4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Plotting qv_2.99m vs time and qv_sonic vs time as scatter plots\n",
    "plt.figure(figsize=(12, 6))\n",
    "\n",
    "# Scatter plot for qv_2.99m\n",
    "plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['qv_2.99m'], label='qv_2.99m', alpha=0.6, s=10, c='green')\n",
    "\n",
    "# Scatter plot for qv_sonic\n",
    "plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['qv_sonic'], label='qv_sonic', alpha=0.6, s=10, c='purple')\n",
    "\n",
    "# Adding labels and title\n",
    "plt.xlabel('Time')\n",
    "plt.ylabel('Specific Humidity (g/kg)')\n",
    "plt.title('Specific Humidity Comparison Mast - Sonic (10 min averages)')\n",
    "plt.legend()\n",
    "plt.grid(True)\n",
    "\n",
    "# Formatting the x-axis\n",
    "plt.gca().xaxis.set_major_locator(mdates.HourLocator(interval=1))\n",
    "plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%m-%d %H:%M'))\n",
    "\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "\n",
    "\n",
    "# Tight layout\n",
    "plt.tight_layout()\n",
    "\n",
    "# Save the plot to a file\n",
    "output_file = os.path.join(data_dir, 'specific_humidity_comparison_plot_10min.png')\n",
    "plt.savefig(output_file)\n",
    "\n",
    "# Show plot\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c2106c4d",
   "metadata": {},
   "outputs": [],
   "source": [
    "#Edit this before running!!!\n",
    "#MWR (IWV/LWP) folder: \n",
    "folder_path = rf\"C:\\path\\to\\your\\Microwave_radiometer\\{month_str}\\{date_str}\"\n",
    "# Load the iwv_lwp_10min_avg.csv file\n",
    "iwv_lwp_file = os.path.join(folder_path, 'iwv_lwp_10min_avg.csv')\n",
    "iwv_lwp_10min_avg = pd.read_csv(iwv_lwp_file)\n",
    "\n",
    "# Convert the 'Time' column in iwv_lwp_10min_avg to datetime if it's not already\n",
    "iwv_lwp_10min_avg['TIMESTAMP'] = pd.to_datetime(iwv_lwp_10min_avg['TIMESTAMP'])\n",
    "\n",
    "# Convert the 'TIMESTAMP' column in merged_data_10min to datetime if it's not already\n",
    "merged_data_10min['TIMESTAMP'] = pd.to_datetime(merged_data_10min['TIMESTAMP'])\n",
    "\n",
    "# Merge the data on the 'TIMESTAMP' column, using nearest match within 10 minutes\n",
    "merged_data_10min = pd.merge_asof(iwv_lwp_10min_avg.sort_values('TIMESTAMP'),\n",
    "                                  merged_data_10min.sort_values('TIMESTAMP'),\n",
    "                                  on='TIMESTAMP', direction='nearest')\n",
    "print(merged_data_10min)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "04e5c9a9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create a larger figure\n",
    "fig, ax = plt.subplots(figsize=(8, 6))\n",
    "\n",
    "# 1) Plot LWP_Corrected with a bolder line\n",
    "ax.plot(\n",
    "    merged_data_10min['TIMESTAMP'],\n",
    "    merged_data_10min['LWP_Corrected'],#_Corrected\n",
    "    color='blue',\n",
    "    linewidth=2.0,\n",
    "    label='LWP_Corrected'\n",
    ")\n",
    "\n",
    "# 2) Thicken all spines (axis borders)\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# 3) Tick parameters for major & minor ticks\n",
    "ax.tick_params(axis='both', which='major', labelsize=12, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# 4) Format x‐axis to show time in HH:MM\n",
    "date_format = mdates.DateFormatter('%H:%M')\n",
    "ax.xaxis.set_major_formatter(date_format)\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# 5) Labels and title with bold font\n",
    "ax.set_title('LWP vs. Time', fontsize=18, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=16, fontweight='bold')\n",
    "ax.set_ylabel('LWP (g/m²)', fontsize=16, fontweight='bold')\n",
    "\n",
    "# 6) Y‐axis tick font size\n",
    "ax.tick_params(axis='y', labelsize=12)\n",
    "\n",
    "# 7) Grid styling\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# 8) Rotate x‐tick labels\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "# 9) (Optional) Legend inside plot\n",
    "legend = ax.legend(fontsize=14, frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# 10) Tight layout and save\n",
    "plt.tight_layout()\n",
    "file_path = os.path.join(folder_path, 'LWP_Corrected_vs_TIMESTAMP_report.png')\n",
    "plt.savefig(file_path, dpi=300)\n",
    "plt.show()\n",
    "\n",
    "print(f\"Plot saved to: {file_path}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cbc61b17",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "#Tsrfc from LWup\n",
    "merged_data_10min['T_srf']=(merged_data_10min['IR20Up']/sigma)**0.25\n",
    "merged_data_10min['esat_srf']=calculate_saturation_pressure(merged_data_10min['T_srf'])\n",
    "merged_data_10min['qsat_srf']=calculate_saturation_specific_humidity(merged_data_10min['esat_srf'],merged_data_10min['BP_mbar_Avg'])\n",
    "\n",
    "print(merged_data_10min.columns)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4fff3497",
   "metadata": {},
   "source": [
    "###  Load Cloud Radar Rain data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9ca25cb5",
   "metadata": {},
   "outputs": [],
   "source": [
    "#Edit this before running!!!\n",
    "# Define the file paths\n",
    "#Cloud Radar folder: \n",
    "cloud_radar_dir = rf\"C:\\path\\to\\your\\Cloud_radar\\{month_str}\\{date_str}\"\n",
    "rain_file = os.path.join(cloud_radar_dir, 'Rain_10min_Averages.csv')\n",
    "cloud_radar_file = os.path.join(cloud_radar_dir, \"cloud_radar_vertical_dataset_10min.parquet\")\n",
    "\n",
    "# Load the rain data\n",
    "rain_data_10min = pd.read_csv(rain_file)\n",
    "\n",
    "# Convert the 'Time' column in iwv_lwp_10min_avg to datetime if it's not already\n",
    "rain_data_10min['TIMESTAMP'] = pd.to_datetime(rain_data_10min['TIMESTAMP'])\n",
    "\n",
    "# Merge the data on the 'TIMESTAMP' column, using nearest match within 10 minutes\n",
    "merged_data_10min = pd.merge_asof(rain_data_10min.sort_values('TIMESTAMP'),\n",
    "                                  merged_data_10min.sort_values('TIMESTAMP'),\n",
    "                                  on='TIMESTAMP', direction='nearest')\n",
    "#print(merged_data_10min)\n",
    "\n",
    "\n",
    "\n",
    "cloud_radar_df = pd.read_parquet(cloud_radar_file)\n",
    "cloud_radar_df[\"timestamp\"] = pd.to_datetime(cloud_radar_df[\"timestamp\"])\n",
    "\n",
    "# 4) Rename all columns except \"TIMESTAMP\" by appending \"_CR\"\n",
    "cols_to_rename = {\n",
    "    col: f\"{col}_CR\"\n",
    "    for col in cloud_radar_df.columns\n",
    "    if col != \"timestamp\"\n",
    "}\n",
    "cloud_radar_df = cloud_radar_df.rename(columns=cols_to_rename)\n",
    "# 7) Merge the existing merged_data_10min (which uses \"TIMESTAMP\") with cloud_radar_df (which uses \"timestamp\")\n",
    "merged_data_10min = pd.merge(\n",
    "    merged_data_10min,\n",
    "    cloud_radar_df,\n",
    "    left_on=\"TIMESTAMP\",\n",
    "    right_on=\"timestamp\",\n",
    "    how=\"left\"\n",
    ")\n",
    "\n",
    "# 8) Drop the duplicate 'timestamp' column (if you only want to keep \"TIMESTAMP\")\n",
    "merged_data_10min = merged_data_10min.drop(columns=[\"timestamp\"])\n",
    "\n",
    "# 9) Inspect the result\n",
    "print(merged_data_10min.head())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6dae48e2",
   "metadata": {},
   "source": [
    "### Flags"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7002de92",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Calculate the temperature difference\n",
    "merged_data_10min['Temp_Diff'] = merged_data_10min['Temperature_K_2.99'] - merged_data_10min['Average_Temperature_Corr']\n",
    "\n",
    "# Create a flag column\n",
    "\n",
    "merged_data_10min['Flag'] = merged_data_10min['Temp_Diff'].apply(lambda x: 1 if abs(x) > 1 else 0)\n",
    "#Create the second flag column for rain condition\n",
    "merged_data_10min['Flag_Rain'] = merged_data_10min['Rain'].apply(lambda x: 1 if x > 0 else 0)\n",
    "\n",
    "# Display the first few rows of the DataFrame to verify\n",
    "# Display the first few rows of the DataFrame to verify\n",
    "print(merged_data_10min[['TIMESTAMP', 'Temperature_K_2.99', 'Average_Temperature_Corr', 'Temp_Diff', 'Rain', 'Flag', 'Flag_Rain']].head())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fe050d38",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create the figure\n",
    "plt.figure(figsize=(10, 6))\n",
    "\n",
    "# Plot all temperature points\n",
    "plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['Temperature_K_2.99'],\n",
    "            label='Temperature_K_2.99 (mast)', alpha=0.4, color='dodgerblue')\n",
    "plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['Average_Temperature_Corr'],\n",
    "            label='T_sonic_corrected', alpha=0.4, color='green')\n",
    "\n",
    "# Highlight flagged outliers\n",
    "flagged_data = merged_data_10min[merged_data_10min['Flag'] == 1]\n",
    "plt.scatter(flagged_data['TIMESTAMP'], flagged_data['Temperature_K_2.99'],\n",
    "            color='red', label='Flagged Outliers ($\\Delta T > 1$ K)', s=50)\n",
    "plt.scatter(flagged_data['TIMESTAMP'], flagged_data['Average_Temperature_Corr'],\n",
    "            color='red', s=50)#label='Flagged Outliers (Avg_Temp_Corr)', \n",
    "\n",
    "# Highlight rain-flagged points\n",
    "flagged_rain_data = merged_data_10min[merged_data_10min['Flag_Rain'] == 1]\n",
    "plt.scatter(flagged_rain_data['TIMESTAMP'], flagged_rain_data['Temperature_K_2.99'],\n",
    "            color='black', marker='+', label='Flagged Rain', s=100)\n",
    "plt.scatter(flagged_rain_data['TIMESTAMP'], flagged_rain_data['Average_Temperature_Corr'],\n",
    "            color='black', marker='+', s=100)#label='Flagged Rain (Avg_Temp_Corr)'\n",
    "\n",
    "# Format x-axis for time\n",
    "plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "plt.xticks(fontsize=18, rotation=45)\n",
    "plt.yticks(fontsize=18)\n",
    "\n",
    "# Labels and title\n",
    "plt.xlabel('Time', fontsize=20)\n",
    "plt.ylabel('Temperature (K)', fontsize=20)\n",
    "plt.title('Temperature Data with Flagged Outliers', fontsize=20, weight='bold')\n",
    "\n",
    "# Grid, legend, and layout\n",
    "plt.grid(True, linestyle='--', alpha=0.7)\n",
    "plt.legend(fontsize=14, loc='upper right')\n",
    "plt.tight_layout()\n",
    "\n",
    "# Save the plot\n",
    "output_path = os.path.join(data_dir, 'temperature_scatter_flagged.png')\n",
    "plt.savefig(output_path, dpi=300)\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "35151244",
   "metadata": {},
   "source": [
    "### Vertical Profiles"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9c567c90",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "#Edit this before running!!!\n",
    "# Define the folder path and file name\n",
    "folder_path = rf\"C:\\path\\to\\your\\Microwave_radiometer\\{month_str}\\{date_str}\"\n",
    "file_name = 'profiles_data_10min_avg.parquet'\n",
    "file_path = f'{folder_path}\\\\{file_name}'\n",
    "\n",
    "rh_ah_vertical= pd.read_parquet(file_path)\n",
    "# Display the first fewf\n",
    "\n",
    "#rows of the dataframe\n",
    "#print(rh_ah_vertical)\n",
    "\n",
    "# 2) Load the “MWR_vertical_dataset_10min.parquet” in the same folder\n",
    "mwr_path = os.path.join(folder_path, \"MWR_vertical_dataset_10min.parquet\")\n",
    "mwr_vertical = pd.read_parquet(mwr_path)\n",
    "\n",
    "print(\"MWR_vertical_dataset_10min:\")\n",
    "#print(mwr_vertical.head())\n",
    "\n",
    "# 3) Ensure both DataFrames have a proper datetime column named \"timestamp\"\n",
    "rh_ah_vertical[\"Time\"] = pd.to_datetime(rh_ah_vertical[\"Time\"])\n",
    "mwr_vertical[\"Time\"] = pd.to_datetime(mwr_vertical[\"Time\"])\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4e2fa0e2",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Define mast heights and corresponding RH columns\n",
    "mast_heights = [2, 2.99, 4.47, 6.69, 10]\n",
    "rh_columns = ['RH_E5567_Avg', 'RH_E5568_Avg', 'RH_E5569_Avg', 'RH_E5570_Avg', 'RH_E5571_Avg']\n",
    "qv_columns= ['qv_2m','qv_2.99m','qv_4.47m','qv_6.69m','qv_10m']\n",
    "def combine_mast_and_radiometer_data(merged_data_10min, df_radiometer):\n",
    "    # Initialize a list to store combined rows\n",
    "    combined_data_list = []\n",
    "\n",
    "    # Iterate through each row of the radiometer data\n",
    "    for i, row in df_radiometer.iterrows():\n",
    "        # Get the corresponding row from the mast data for the same timestamp\n",
    "        timestamp = row['Time']\n",
    "        mast_row = merged_data_10min[merged_data_10min['TIMESTAMP'] == timestamp]\n",
    "        \n",
    "        if not mast_row.empty:\n",
    "            # Extract RH values from the mast row\n",
    "            mast_rh_values = mast_row[rh_columns].values.flatten().tolist()\n",
    "            mast_qv_values = mast_row[qv_columns].values.flatten().tolist()\n",
    "            # Prepare the combined data\n",
    "            combined_data = {\n",
    "                'Time': timestamp,\n",
    "                'Radiometer_Altitude': row['Altitude'],\n",
    "                'Radiometer_RH_Profile': row['RH_Profile'],\n",
    "                'Radiometer_AH_Profile': row['AH_Profile'],\n",
    "                'Mast_Heights': mast_heights,\n",
    "                'Mast_RH_Profile': mast_rh_values,\n",
    "                'Mast_qv_Profile': mast_qv_values\n",
    "            }\n",
    "            \n",
    "            # Append combined data to the list\n",
    "            combined_data_list.append(combined_data)\n",
    "    \n",
    "    # Create a DataFrame from the combined list of dictionaries\n",
    "    df_combined = pd.DataFrame(combined_data_list)\n",
    "\n",
    "    return df_combined\n",
    "\n",
    "\n",
    "df_combined = combine_mast_and_radiometer_data(merged_data_10min, rh_ah_vertical)\n",
    "print(df_combined.head())  # View the combined data\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fc8a15fe",
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# Plot the combined RH profiles\n",
    "def plot_combined_profiles(time_index, df_combined):\n",
    "    time = df_combined['Time'][time_index]\n",
    "    \n",
    "    # Extract the radiometer and mast profiles\n",
    "    radiometer_altitude = df_combined['Radiometer_Altitude'][time_index]\n",
    "    radiometer_rh_profile = df_combined['Radiometer_RH_Profile'][time_index]\n",
    "    mast_heights = df_combined['Mast_Heights'][time_index]\n",
    "    mast_rh_profile = df_combined['Mast_RH_Profile'][time_index]\n",
    "\n",
    "    # Plot radiometer RH profile\n",
    "    plt.figure(figsize=(6, 8))\n",
    "    plt.plot(radiometer_rh_profile, radiometer_altitude, 'b-', label='Radiometer RH Profile')\n",
    "    \n",
    "    # Plot mast RH profile\n",
    "    plt.plot(mast_rh_profile, mast_heights, 'ro-', label='Mast RH Profile')\n",
    "\n",
    "    plt.xlabel('Relative Humidity (%)')\n",
    "    plt.ylabel('Altitude (m)')\n",
    "    plt.title(f'Combined RH Profiles at {time}')\n",
    "    plt.legend()\n",
    "    # Zoom in on the lower 1 km\n",
    "    #plt.ylim([0, 100])\n",
    "    \n",
    "    plt.show()\n",
    "\n",
    "# Example: Plot the combined RH profiles for the first timestamp\n",
    "plot_combined_profiles(0, df_combined)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8d3811c2",
   "metadata": {},
   "outputs": [],
   "source": [
    "merged_mwr = pd.merge(\n",
    "    df_combined,\n",
    "    mwr_vertical,\n",
    "    on=\"Time\",\n",
    "    how=\"inner\",\n",
    "    suffixes=(\"\", \"_mwr\")\n",
    ")\n",
    "\n",
    "\n",
    "\n",
    "print(\"Merged (all columns) – first 5 rows:\")\n",
    "print(merged_mwr.head())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e9f0d58d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define the new file path for temperature profiles\n",
    "temperature_file_name = 'vertical_temperature_profiles_10min_avg.parquet'\n",
    "temperature_file_path = f'{folder_path}\\\\{temperature_file_name}'\n",
    "\n",
    "# Load the temperature profiles data\n",
    "temperature_profiles_df = pd.read_parquet(temperature_file_path)\n",
    "print(temperature_profiles_df)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7e92a710",
   "metadata": {},
   "outputs": [],
   "source": [
    "temperature_columns = ['Temperature_K_2', 'Temperature_K_2.99', 'Temperature_K_4.47', 'Temperature_K_6.69', 'Temperature_K_10']\n",
    "vdse_columns= ['Virtual_Dry_Static_Energy_2', 'Virtual_Dry_Static_Energy_2.99', 'Virtual_Dry_Static_Energy_4.47', 'Virtual_Dry_Static_Energy_6.69', 'Virtual_Dry_Static_Energy_10']\n",
    "def combine_mast_and_temperature_data(merged_data_10min, df_temperature_profiles):\n",
    "    # Initialize a list to store combined rows\n",
    "    combined_data_temperature_list = []\n",
    "\n",
    "    # Iterate through each row of the temperature profiles data\n",
    "    for i, row in df_temperature_profiles.iterrows():\n",
    "        # Get the corresponding row from the mast data for the same timestamp\n",
    "        timestamp = row['Time']\n",
    "        mast_row = merged_data_10min[merged_data_10min['TIMESTAMP'] == timestamp]\n",
    "        \n",
    "        if not mast_row.empty:\n",
    "            # Extract temperature values from the mast row\n",
    "            mast_temp_values = mast_row[temperature_columns].values.flatten().tolist()\n",
    "            mast_vdse_values =mast_row[vdse_columns].values.flatten().tolist()\n",
    "            # Prepare the combined data\n",
    "            combined_data = {\n",
    "                'Time': timestamp,\n",
    "                'Temperature_Altitude': row['Altitude'],\n",
    "                'Temperature_Profile': row['T_Profile'],  # Adjust this if necessary\n",
    "                'Mast_Heights': mast_heights,\n",
    "                'Mast_Temperature_Profile': mast_temp_values,\n",
    "                'Mast_VDSE_Profile': mast_vdse_values\n",
    "            }\n",
    "            \n",
    "            # Append combined data to the list\n",
    "            combined_data_temperature_list.append(combined_data)\n",
    "    \n",
    "    # Create a DataFrame from the combined list of dictionaries\n",
    "    df_combined_temperature = pd.DataFrame(combined_data_temperature_list)\n",
    "\n",
    "    return df_combined_temperature\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "97e3c211",
   "metadata": {},
   "outputs": [],
   "source": [
    "df_combined_temperature = combine_mast_and_temperature_data(merged_data_10min, temperature_profiles_df)\n",
    "print(df_combined_temperature.head())  # View the combined data\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "32a7621a",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "def plot_combined_temperature_profiles(time_index, df_combined):\n",
    "    time = df_combined['Time'][time_index]\n",
    "    \n",
    "    # Extract the radiometer and mast profiles\n",
    "    radiometer_altitude = df_combined['Temperature_Altitude'][time_index]\n",
    "    radiometer_temp_profile = df_combined['Temperature_Profile'][time_index]\n",
    "    mast_heights = df_combined['Mast_Heights'][time_index]\n",
    "    mast_temp_profile = df_combined['Mast_Temperature_Profile'][time_index]\n",
    "\n",
    "    # Plot radiometer temperature profile\n",
    "    plt.figure(figsize=(6, 8))\n",
    "    plt.plot(radiometer_temp_profile, radiometer_altitude, 'b-', label='Radiometer Temperature Profile')\n",
    "    \n",
    "    # Plot mast temperature profile\n",
    "    plt.plot(mast_temp_profile, mast_heights, 'ro-', label='Mast Temperature Profile')\n",
    "\n",
    "    plt.xlabel('Temperature (°C)')\n",
    "    plt.ylabel('Altitude (m)')\n",
    "    plt.title(f'Combined Temperature Profiles at {time}')\n",
    "    plt.legend()\n",
    "    # Zoom in on the lower 1 km if needed\n",
    "    #plt.ylim([0, 100])\n",
    "    \n",
    "    plt.show()\n",
    "\n",
    "# Example: Plot the combined temperature profiles for the first timestamp\n",
    "plot_combined_temperature_profiles(10, df_combined_temperature)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f9b4bc41",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Combine df_combined with df_combined_temperature on the 'Time' column\n",
    "df_combined_all = pd.merge(\n",
    "    merged_mwr,\n",
    "    df_combined_temperature,\n",
    "    on='Time',\n",
    "    suffixes=('_RH', '_Temp')\n",
    ")\n",
    "\n",
    "# Display the combined DataFrame\n",
    "print(df_combined_all.head())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "72d3b069",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "def plot_combined_profiles(time_index, df_combined_all,data_dir):\n",
    "    # Extract the time and profiles\n",
    "    time = df_combined_all['Time'][time_index]\n",
    "    \n",
    "    # Relative Humidity Profiles\n",
    "    radiometer_altitude_rh = df_combined_all['Radiometer_Altitude'][time_index]\n",
    "    radiometer_rh_profile = df_combined_all['Radiometer_RH_Profile'][time_index]\n",
    "    mast_heights_rh = df_combined_all['Mast_Heights_RH'][time_index]\n",
    "    mast_rh_profile = df_combined_all['Mast_RH_Profile'][time_index]\n",
    "    \n",
    "    # Temperature Profiles\n",
    "    radiometer_temp_profile = df_combined_all['Temperature_Profile'][time_index]\n",
    "    mast_heights_temp = df_combined_all['Mast_Heights_Temp'][time_index]\n",
    "    mast_temp_profile = df_combined_all['Mast_Temperature_Profile'][time_index]\n",
    "\n",
    "    # Create figure and subplots\n",
    "    fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(14, 8), sharey=True)\n",
    "    \n",
    "    # Plot RH profiles in the first subplot\n",
    "    ax1.plot(radiometer_rh_profile, radiometer_altitude_rh, 'b-', label='Radiometer RH Profile')\n",
    "    ax1.plot(mast_rh_profile, mast_heights_rh, 'ro-', label='Mast RH Profile')\n",
    "    ax1.set_xlabel('Relative Humidity (%)')\n",
    "    ax1.set_ylabel('Altitude (m)')\n",
    "    ax1.set_title(f'RH Profiles at {time}')\n",
    "    ax1.legend()\n",
    "    ax1.grid(True)\n",
    "    \n",
    "    # Plot temperature profiles in the second subplot\n",
    "    ax2.plot(radiometer_temp_profile, radiometer_altitude_rh, 'g--', label='Radiometer Temperature Profile')\n",
    "    ax2.plot(mast_temp_profile, mast_heights_temp, 'mo-', label='Mast Temperature Profile')\n",
    "    ax2.set_xlabel('Temperature (K)')\n",
    "    ax2.set_title(f'Temperature Profiles at {time}')\n",
    "    ax2.legend()\n",
    "    ax2.grid(True)\n",
    "    \n",
    "    plt.tight_layout()\n",
    "    \n",
    "    # Create save path\n",
    "    save_path = os.path.join(data_dir, f'example_vertical_combined_profiles_plot.png')\n",
    "    \n",
    "    # Save plot to a file\n",
    "    plt.savefig(save_path)\n",
    "    plt.show()\n",
    "\n",
    "# Example: Plot the combined profiles for the first timestamp\n",
    "plot_combined_profiles(19, df_combined_all, data_dir)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1f41d424",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "print(df_combined_all) "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6152372e",
   "metadata": {},
   "source": [
    "### Cloud Type"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "11e7205f",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "def classify_cloud_type(row):\n",
    "    rh_profile = np.array(row['Radiometer_RH_Profile'])\n",
    "    altitude_profile = np.array(row['Radiometer_Altitude'])\n",
    "    lwp = row['LWP_Corrected']#_Corrected\n",
    "    flag = row['Flag']\n",
    "    flag_rain = row['Flag_Rain']\n",
    "    csi = row['CSI']\n",
    "\n",
    "    # Find RH just above 2 km (e.g., within 2–2.5 km)\n",
    "    \n",
    "    above_2km = (\n",
    "    (altitude_profile > 1500)\n",
    "    & (altitude_profile <= 2500))\n",
    "   # & (lwp < 20))\n",
    "    rh_near_above_2km = rh_profile[above_2km]\n",
    "\n",
    "    if flag != 0 or flag_rain != 0:\n",
    "        return 'Flagged Observation'\n",
    "\n",
    "    boundary_layer_altitude = 2000  # meters\n",
    "    boundary_layer_indices = np.where(altitude_profile <= boundary_layer_altitude)\n",
    "    rh_boundary_layer = rh_profile[boundary_layer_indices]\n",
    "    mean_rh_boundary_layer = np.mean(rh_boundary_layer)\n",
    "    rh_above_boundary_layer = rh_profile[altitude_profile > boundary_layer_altitude]\n",
    "\n",
    "    if pd.notna(csi) and csi >= 0.7 and mean_rh_boundary_layer < 50 and lwp < 20:\n",
    "        return 'Clear Sky'\n",
    "    #elif mean_rh_boundary_layer > 80 and 50 <= lwp <= 300 and np.any(rh_above_boundary_layer < 60):\n",
    "    # 3) Stratocumulus: require CSI < 0.3 AND high BL humidity AND moderate LWP AND capping inversion\n",
    "    elif (pd.notna(csi) and csi < 0.5 and mean_rh_boundary_layer > 80 and 50 <= lwp <= 300 and np.mean(rh_above_boundary_layer) < 60):\n",
    "        return 'Stratocumulus'\n",
    "\n",
    "    #elif (mean_rh_boundary_layer > 80 and 50 <= lwp <= 300 and np.mean(rh_near_above_2km) < 60):\n",
    "     #   return 'Stratocumulus'\n",
    "        #pd.notna(csi) and 0.3 <= csi < 0.7 and\n",
    "    elif (60 <= mean_rh_boundary_layer <= 80 and 20 <= lwp < 50\n",
    "    #elif (60 <= rh_profile[altitude_profile <= 2000].mean() <= 80\n",
    "      #    and 20 <= lwp < 50\n",
    "          and np.all(np.diff(rh_above_boundary_layer) <= 0)):\n",
    "        return 'Shallow Cumulus'\n",
    "    return 'Unclassified'\n",
    "\n",
    "\n",
    "# Now apply this function to classify each time step\n",
    "\n",
    "# Assuming 'df_combined_all' contains the profile data and 'merged_data_10min' contains LWP and SWR data\n",
    "# Merge both dataframes on 'Time'\n",
    "df_combined_all['Time'] = pd.to_datetime(df_combined_all['Time'])\n",
    "# Rename the 'TIMESTAMP' column to 'Time' in the merged_data_10min dataframe\n",
    "merged_data_10min.rename(columns={'TIMESTAMP': 'Time'}, inplace=True)\n",
    "\n",
    "merged_data_10min['Time'] = pd.to_datetime(merged_data_10min['Time'])\n",
    "\n",
    "# Merge on 'Time'\n",
    "#df_merged = pd.merge(df_combined_all, merged_data_10min[['Time', 'LWP_Corrected']], on='Time')\n",
    "df_merged = pd.merge(\n",
    "    df_combined_all,\n",
    "    merged_data_10min[['Time', 'LWP_Corrected', 'Flag', 'Flag_Rain','CSI']],#_Corrected\n",
    "    on='Time'\n",
    ")\n",
    "\n",
    "# Apply the classification\n",
    "df_merged['Cloud_Type'] = df_merged.apply(classify_cloud_type, axis=1)\n",
    "\n",
    "# Output the classified data\n",
    "print(df_merged[['Time', 'Cloud_Type', 'LWP_Corrected','CSI']])#_Corrected\n",
    "\n",
    "# Define the full path to save the Excel file\n",
    "output_path = f'{data_dir}\\\\cloud_classification_3.xlsx'\n",
    "\n",
    "# Save the DataFrame to the Excel file\n",
    "df_merged.to_excel(output_path, index=False)\n",
    "\n",
    "# Confirmation message\n",
    "print(f\"File saved to {output_path}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b49d7972",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "def plot_all_cloud_profiles(df):\n",
    "      # Create a base folder for all plots\n",
    "    plots_dir = os.path.join(data_dir, \"cloud_profiles_3\")\n",
    "    os.makedirs(plots_dir, exist_ok=True)\n",
    "\n",
    "    # Filter out flagged and unclassified observations\n",
    "    valid_df = df[~df['Cloud_Type'].isin(['Unclassified', 'Flagged Observation', 'Excluded (Flagged)'])]\n",
    "\n",
    "    # Get unique valid cloud types\n",
    "    cloud_types = valid_df['Cloud_Type'].unique()\n",
    "\n",
    "    for cloud_type in cloud_types:\n",
    "        # Create a subfolder for each cloud type\n",
    "        folder_path = os.path.join(plots_dir, cloud_type.replace(\" \", \"_\"))\n",
    "        os.makedirs(folder_path, exist_ok=True)\n",
    "\n",
    "        # Filter data for this cloud type\n",
    "        cloud_df = valid_df[valid_df['Cloud_Type'] == cloud_type]\n",
    "\n",
    "        for idx, row in cloud_df.iterrows():\n",
    "            # Common metadata\n",
    "            timestamp = pd.to_datetime(row['Time'])\n",
    "            timestamp_str = timestamp.strftime('%Y%m%d_%H%M')\n",
    "            lwp = row['LWP']#_Corrected\n",
    "\n",
    "            # === RH PROFILE PLOT ===\n",
    "            rh_profile = np.array(row['Radiometer_RH_Profile'])\n",
    "            altitude_profile = np.array(row['Radiometer_Altitude'])\n",
    "            mast_rh_profile = np.array(row['Mast_RH_Profile'])\n",
    "            mast_heights_rh = np.array(row['Mast_Heights_RH'])\n",
    "\n",
    "            fig_rh, ax_rh = plt.subplots(figsize=(5, 6))\n",
    "            ax_rh.scatter(rh_profile, altitude_profile, label='Radiometer RH Profile', color='b', marker='o')\n",
    "            ax_rh.scatter(mast_rh_profile, mast_heights_rh, label='Mast RH Profile', color='g', marker='x')\n",
    "            ax_rh.axhline(y=2000, color='r', linestyle='--', label='2 km Boundary Layer')\n",
    "            ax_rh.set_xlabel('Relative Humidity (%)')\n",
    "            ax_rh.set_ylabel('Altitude (meters)')\n",
    "            ax_rh.set_title(f'{cloud_type} - RH Profile\\nTime: {timestamp}')\n",
    "            ax_rh.legend()\n",
    "            ax_rh.annotate(f'LWP: {lwp:.2f} g/m²', xy=(0.05, 0.95), xycoords='axes fraction',\n",
    "                           fontsize=10, bbox=dict(boxstyle='round', fc='white', edgecolor='black'))\n",
    "\n",
    "            plt.tight_layout()\n",
    "            rh_filename = f'{cloud_type.replace(\" \", \"_\")}_{timestamp_str}_RH.png'\n",
    "            plt.savefig(os.path.join(folder_path, rh_filename))\n",
    "            plt.close(fig_rh)\n",
    "\n",
    "            # === TEMPERATURE PROFILE PLOT ===\n",
    "            temp_profile = np.array(row['Temperature_Profile'])\n",
    "            mast_temp_profile = np.array(row['Mast_Temperature_Profile'])\n",
    "            mast_heights_temp = np.array(row['Mast_Heights_Temp'])\n",
    "\n",
    "            fig_temp, ax_temp = plt.subplots(figsize=(5, 6))\n",
    "            ax_temp.scatter(temp_profile, altitude_profile, label='Radiometer Temp Profile', color='orange', marker='o')\n",
    "            ax_temp.scatter(mast_temp_profile, mast_heights_temp, label='Mast Temp Profile', color='purple', marker='x')\n",
    "            ax_temp.axhline(y=2000, color='r', linestyle='--', label='2 km Boundary Layer')\n",
    "            ax_temp.set_xlabel('Temperature (K)')\n",
    "            ax_temp.set_ylabel('Altitude (meters)')\n",
    "            ax_temp.set_title(f'{cloud_type} - Temperature Profile\\nTime: {timestamp}')\n",
    "            ax_temp.legend()\n",
    "            ax_temp.annotate(f'LWP: {lwp:.2f} g/m²', xy=(0.05, 0.95), xycoords='axes fraction',\n",
    "                             fontsize=10, bbox=dict(boxstyle='round', fc='white', edgecolor='black'))\n",
    "\n",
    "            plt.tight_layout()\n",
    "            temp_filename = f'{cloud_type.replace(\" \", \"_\")}_{timestamp_str}_Temp.png'\n",
    "            plt.savefig(os.path.join(folder_path, temp_filename))\n",
    "            plt.close(fig_temp)\n",
    "\n",
    "            \n",
    "    # ✅ Moved this line inside the function so it has access to `plots_dir`\n",
    "    print(f\"\\n✅ Cloud profile plots saved in: {plots_dir}\")\n",
    "\n",
    "#plot_all_cloud_profiles(df_merged)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "39d16934",
   "metadata": {},
   "source": [
    "### VERtical PRofiles"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "078da8af",
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# Combine df_combined with df_combined_temperature on the 'Time' column\n",
    "merged_data_10min = pd.merge(\n",
    "    merged_data_10min,\n",
    "    df_combined_all,\n",
    "    on='Time',\n",
    "    how=\"left\"\n",
    ")\n",
    "# 8) Drop the duplicate 'timestamp' column (if you only want to keep \"TIMESTAMP\")\n",
    "#merged_data_10min = merged_data_10min.drop(columns=[\"Time\"])\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "51a2df4d",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "def plot_temperature_profiles(\n",
    "    time_index: int,\n",
    "    merged_df,\n",
    "    fontsize_labels=14,\n",
    "    fontsize_ticks=12,\n",
    "    fontsize_title=16\n",
    "):\n",
    "    \"\"\"\n",
    "    Plot MWR (blue circles), CR (red triangles with black border), and Mast (red squares)\n",
    "    for a given row index in merged_df. Also adjusts axis styling for report/PowerPoint.\n",
    "\n",
    "    Parameters\n",
    "    ----------\n",
    "    time_index : int\n",
    "        Integer index of the row to plot (0 <= time_index < len(merged_df)).\n",
    "    merged_df : pandas.DataFrame\n",
    "        Your merged_data_10min DataFrame that contains the following array columns:\n",
    "          - 'Radiometer_Altitude'\n",
    "          - 'Temperature_Profile'\n",
    "          - 'temperature_altitudes_CR'\n",
    "          - 'temperature_profile_CR'\n",
    "          - 'Mast_Heights_Temp'\n",
    "          - 'Mast_Temperature_Profile'\n",
    "          - 'TIMESTAMP' (or 'Time')\n",
    "    fontsize_labels : int, optional\n",
    "        Font size for x/y labels. Default is 14.\n",
    "    fontsize_ticks : int, optional\n",
    "        Font size for tick labels. Default is 12.\n",
    "    fontsize_title : int, optional\n",
    "        Font size for the plot title. Default is 16.\n",
    "    \"\"\"\n",
    "\n",
    "    # 1) pull out that single row as a Series\n",
    "    row = merged_df.iloc[time_index]\n",
    "\n",
    "    # 2) extract MWR (“Radiometer”) arrays\n",
    "    radiometer_alt  = row['Radiometer_Altitude']       # e.g. [0,10,25,…]\n",
    "    radiometer_temp = row['Temperature_Profile']       # T_Profile from MWR\n",
    "    radiometer_thetav=row['Theta_v_Alternative']\n",
    "    # 3) extract CR (“Cloud Radar”) arrays\n",
    "    cr_alt   = row['temperature_altitudes_CR']         # e.g. [0,20,40,…]\n",
    "    cr_temp  = row['temperature_profile_CR']           # corresponding T\n",
    "    cr_thetav = row['theta_v_CR']\n",
    "    # 4) extract Mast‐measured temperature arrays\n",
    "    mast_alt  = row['Mast_Heights_Temp']               # e.g. [2,2.99,4.47,…]\n",
    "    mast_temp = row['Mast_Temperature_Profile']        # corresponding T values\n",
    "\n",
    "    # 5) get the timestamp string for the title\n",
    "    timestamp = row['Time']  # or row['Time'], depending on where you want to grab it\n",
    "    \n",
    "    # 6) Start figure\n",
    "    plt.figure(figsize=(4, 6))\n",
    "\n",
    "    # 7) Plot MWR Temperature (red solid, circle marker)\n",
    "    plt.plot(\n",
    "        radiometer_temp,\n",
    "        radiometer_alt,\n",
    "        color='red',\n",
    "        linestyle='-',\n",
    "       # marker='o',\n",
    "        markersize=4,\n",
    "        linewidth=1.5,\n",
    "        label='T(K) MWR'\n",
    "    )\n",
    "\n",
    "    # 8) Plot CR Temperature (orange solid, circle marker)\n",
    "    plt.plot(\n",
    "        cr_temp,\n",
    "        cr_alt,\n",
    "        color='red',\n",
    "        linestyle='--',\n",
    "       # marker='o',\n",
    "        markersize=4,\n",
    "        linewidth=1.5,\n",
    "        label='T(K) CR'\n",
    "    )\n",
    "\n",
    "    # 9) Plot Mast Temperature (green solid, x‐marker)\n",
    "    plt.plot(\n",
    "        mast_temp,\n",
    "        mast_alt,\n",
    "        color='green',\n",
    "        linestyle='-',\n",
    "        marker='x',\n",
    "        markersize=6,\n",
    "        linewidth=1.5,\n",
    "        label='T(K) Mast'\n",
    "    )\n",
    "\n",
    "    # 10) Plot MWR Θv (red dashed, diamond marker)\n",
    "    plt.plot(\n",
    "        radiometer_thetav,\n",
    "        radiometer_alt,\n",
    "        color='orange',\n",
    "        linestyle='-',\n",
    "       # marker='D',\n",
    "        markersize=4,\n",
    "        linewidth=1.2,\n",
    "        label='Θv(K) MWR'\n",
    "    )\n",
    "\n",
    "    # 11) Plot CR Θv (orange dashed, diamond marker)\n",
    "    plt.plot(\n",
    "        cr_thetav,\n",
    "        cr_alt,\n",
    "        color='orange',\n",
    "        linestyle='--',\n",
    "    #    marker='D',\n",
    "        markersize=4,\n",
    "        linewidth=1.2,\n",
    "        label='Θv(K) CR'\n",
    "    )\n",
    "\n",
    "    # 12) Labels and title\n",
    "    plt.xlabel('T, Θv (K)', fontsize=fontsize_labels)\n",
    "    plt.ylabel('Altitude (m)', fontsize=fontsize_labels)\n",
    "    plt.title(f'Profiles at {timestamp}', fontsize=fontsize_title)\n",
    "    plt.ylim(-100,2000)\n",
    "    # 13) Legend\n",
    "    plt.legend(loc='best', fontsize=fontsize_ticks)\n",
    "\n",
    "    # 14) Grid (light dashed)\n",
    "    plt.grid(True, linestyle='--', alpha=0.4)\n",
    "\n",
    "    # 15) Tick label sizes\n",
    "    plt.xticks(fontsize=fontsize_ticks)\n",
    "    plt.yticks(fontsize=fontsize_ticks)\n",
    "\n",
    "    # 16) Thicken axis spines\n",
    "    ax = plt.gca()\n",
    "    for spine in ['left', 'bottom', 'right', 'top']:\n",
    "        ax.spines[spine].set_linewidth(1.2)\n",
    "\n",
    "    # 17) (Optional) invert y‐axis if you want altitude increasing upward\n",
    "    # plt.gca().invert_yaxis()\n",
    "\n",
    "    plt.tight_layout()\n",
    "    plt.show()\n",
    "\n",
    "\n",
    "# ─────────────── Example of how to call it ───────────────\n",
    "# Suppose you want the 10th ten‐minute timestamp in merged_data_10min:\n",
    "# plot_temperature_profiles(9, merged_data_10min)\n",
    "plot_temperature_profiles(38, merged_data_10min)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3a3e6d27",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "def plot_rh_profiles(\n",
    "    time_index: int,\n",
    "    merged_df,\n",
    "    fontsize_labels=14,\n",
    "    fontsize_ticks=12,\n",
    "    fontsize_title=16\n",
    "):\n",
    "    \"\"\"\n",
    "    For a given integer row index in merged_df, plot relative‐humidity profiles for:\n",
    "      • MWR RH (blue solid line with circle markers)\n",
    "      • CR RH (orange solid line with circle markers)\n",
    "      • Mast RH (green solid line with x‐markers)\n",
    "\n",
    "    merged_df must contain these columns (as array‐like lists):\n",
    "      - 'Radiometer_Altitude'\n",
    "      - 'Radiometer_RH_Profile'\n",
    "      - 'Mast_Heights_RH'\n",
    "      - 'Mast_RH_Profile'\n",
    "      - 'humidity_altitudes_CR'\n",
    "      - 'rel_humidity_profile_CR'\n",
    "      - 'TIMESTAMP' (or 'Time')\n",
    "    \"\"\"\n",
    "\n",
    "    # 1) Extract the row\n",
    "    row = merged_df.iloc[time_index]\n",
    "\n",
    "    # 2) MWR (“Radiometer”) RH arrays\n",
    "    radiometer_alt = row['Radiometer_Altitude']      # e.g. [0, 10, 25, …]\n",
    "    radiometer_rh  = row['Radiometer_RH_Profile']    # e.g. [81.7, 79.9, …]\n",
    "    radiometer_ah  = row['Radiometer_AH_Profile']      # e.g. [7.68, 7.62, …]\n",
    "\n",
    "    # 3) CR (“Cloud Radar”) RH arrays\n",
    "    cr_alt = row['humidity_altitudes_CR']            # e.g. [0, 20, 40, …]\n",
    "    cr_rh  = row['rel_humidity_profile_CR']          # e.g. [75.2, 72.1, …]\n",
    "    cr_ah  = row['abs_humidity_profile_CR']       # e.g. [5.1, 4.8, …]\n",
    "\n",
    "    # 4) Mast RH arrays\n",
    "    mast_alt = row['Mast_Heights_RH']                # e.g. [2, 2.99, 4.47, …]\n",
    "    mast_rh  = row['Mast_RH_Profile']                # e.g. [94.0, 93.5, …]\n",
    "\n",
    "    # 5) Timestamp for title\n",
    "    timestamp = row.get('TIMESTAMP', row.get('Time', ''))\n",
    "\n",
    "   # 6) Start the figure and primary axis (RH)\n",
    "    fig, ax1 = plt.subplots(figsize=(4, 6))\n",
    "\n",
    "    # 7) Plot RH on ax1 (bottom x‐axis)\n",
    "    ax1.plot(\n",
    "        radiometer_rh,\n",
    "        radiometer_alt,\n",
    "        color='blue',\n",
    "        linestyle='-',\n",
    "       # marker='o',\n",
    "        #markersize=4,\n",
    "        linewidth=1.5,\n",
    "        label='RH MWR'\n",
    "    )\n",
    "    ax1.plot(\n",
    "        cr_rh,\n",
    "        cr_alt,\n",
    "        color='black',\n",
    "        linestyle='--',\n",
    "        #marker='o',\n",
    "        #markersize=4,\n",
    "        linewidth=1.5,\n",
    "        label='RH CR'\n",
    "    )\n",
    "    ax1.plot(\n",
    "        mast_rh,\n",
    "        mast_alt,\n",
    "        color='green',\n",
    "        linestyle='-',\n",
    "        marker='x',\n",
    "        markersize=6,\n",
    "        linewidth=1.5,\n",
    "        label='RH Mast'\n",
    "    )\n",
    "\n",
    "    # 8) Configure ax1 (RH axis)\n",
    "    ax1.set_xlabel('RH (%)', fontsize=fontsize_labels)\n",
    "    ax1.set_ylabel('Altitude (m)',          fontsize=fontsize_labels)\n",
    "    ax1.tick_params(axis='x', labelsize=fontsize_ticks)\n",
    "    ax1.tick_params(axis='y', labelsize=fontsize_ticks)\n",
    "    ax1.grid(True, linestyle='--', alpha=0.4)\n",
    "    for spine in ['left', 'bottom', 'right', 'top']:\n",
    "        ax1.spines[spine].set_linewidth(1.2)\n",
    "\n",
    "    # 9) Create secondary x‐axis (AH) sharing the same y‐axis\n",
    "    ax2 = ax1.twiny()\n",
    "    ax2.plot(\n",
    "        radiometer_ah,\n",
    "        radiometer_alt,\n",
    "        color='blue',\n",
    "        linestyle='-',\n",
    "       # marker='s',\n",
    "        #markersize=4,\n",
    "        linewidth=1.2,\n",
    "        label='AH MWR'\n",
    "    )\n",
    "    ax2.plot(\n",
    "        cr_ah,\n",
    "        cr_alt,\n",
    "        color='skyblue',\n",
    "        linestyle='--',\n",
    "        #marker='v',\n",
    "       # markersize=4,\n",
    "        linewidth=1.2,\n",
    "        label='AH CR'\n",
    "    )\n",
    "\n",
    "    # 10) Configure ax2 (AH axis)\n",
    "    ax2.set_xlabel('AH (g/m³)', fontsize=fontsize_labels)\n",
    "    ax2.tick_params(axis='x', labelsize=fontsize_ticks)\n",
    "    for spine in ['left', 'bottom', 'right', 'top']:\n",
    "        ax2.spines[spine].set_linewidth(1.2)\n",
    "    plt.ylim(-100,3000)\n",
    "    # 11) Title\n",
    "    plt.title(f'Profiles at {timestamp}', fontsize=fontsize_title)\n",
    "\n",
    "    # 12) Combine legends from both axes\n",
    "    handles1, labels1 = ax1.get_legend_handles_labels()\n",
    "    handles2, labels2 = ax2.get_legend_handles_labels()\n",
    "    ax1.legend(handles1 + handles2, labels1 + labels2, loc='best', fontsize=fontsize_ticks)\n",
    "\n",
    "    plt.tight_layout()\n",
    "    plt.show()\n",
    "    \n",
    "plot_rh_profiles(68, merged_data_10min)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dcdbf1ac",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "def plot_qv_profiles(\n",
    "    time_index: int,\n",
    "    merged_df,\n",
    "    fontsize_labels=14,\n",
    "    fontsize_ticks=12,\n",
    "    fontsize_title=16\n",
    "):\n",
    "    \"\"\"\n",
    "    For a given integer row index in merged_df, plot specific‐humidity (qv) profiles for:\n",
    "      • MWR qv (blue solid line with circle markers)\n",
    "      • CR qv  (orange solid line with circle markers)\n",
    "      • Mast qv (green solid line with x‐markers)\n",
    "\n",
    "    merged_df must contain these columns (as array‐like lists):\n",
    "      - 'Altitude'                      (MWR altitudes)\n",
    "      - 'Specific Humidity (qv)'        (MWR qv profile)\n",
    "      - 'humidity_altitudes_CR'         (CR altitudes)\n",
    "      - 'specific_humidity_CR'          (CR qv profile)\n",
    "      - 'Mast_Heights_Temp'             (mast altitudes—same mast levels)\n",
    "      - 'Mast_qv_Profile'               (mast qv profile)\n",
    "      - 'TIMESTAMP' (or 'Time')\n",
    "    \"\"\"\n",
    "\n",
    "    # 1) Extract the selected row\n",
    "    row = merged_df.iloc[time_index]\n",
    "\n",
    "    # 2) MWR (“Radiometer”) qv arrays\n",
    "    mwr_alt = row['Radiometer_Altitude']                       # e.g. [0, 10, 25, …]\n",
    "    mwr_qv  = row['Specific Humidity (qv)']          # e.g. [7.27, 6.89, …]\n",
    "\n",
    "    # 3) CR (“Cloud Radar”) qv arrays\n",
    "    cr_alt = row['humidity_altitudes_CR']            # e.g. [0, 20, 40, …]\n",
    "    cr_qv  = row['specific_humidity_CR']             # e.g. [6.14, 5.98, …]\n",
    "\n",
    "    # 4) Mast qv arrays\n",
    "    mast_alt = row['Mast_Heights_RH']              # e.g. [2, 2.99, 4.47, …]\n",
    "    mast_qv  = row['Mast_qv_Profile']                # e.g. [7.12, 6.98, …]\n",
    "\n",
    "    # 5) Timestamp for title\n",
    "    timestamp = row.get('TIMESTAMP', row.get('Time', ''))\n",
    "\n",
    "    # 6) Start the figure\n",
    "    plt.figure(figsize=(4, 6))\n",
    "\n",
    "    # 7) Plot MWR qv (blue solid, circle marker)\n",
    "    plt.plot(\n",
    "        mwr_qv,\n",
    "        mwr_alt,\n",
    "        color='purple',\n",
    "        linestyle='-',\n",
    "        #marker='o',\n",
    "        #markersize=4,\n",
    "        linewidth=1.5,\n",
    "        label='qv MWR'\n",
    "    )\n",
    "\n",
    "    # 8) Plot CR qv (orange solid, circle marker)\n",
    "    plt.plot(\n",
    "        cr_qv,\n",
    "        cr_alt,\n",
    "        color='black',\n",
    "        linestyle='--',\n",
    "       # marker='o',\n",
    "        #markersize=4,\n",
    "        linewidth=1.5,\n",
    "        label='qv CR'\n",
    "    )\n",
    "\n",
    "    # 9) Plot Mast qv (green solid, x‐marker)\n",
    "    plt.plot(\n",
    "        mast_qv,\n",
    "        mast_alt,\n",
    "        color='green',\n",
    "        linestyle='-',\n",
    "        marker='x',\n",
    "        markersize=6,\n",
    "        linewidth=1.5,\n",
    "        label='qv Mast'\n",
    "    )\n",
    "\n",
    "    # 10) Labels and title\n",
    "    plt.xlabel('qv (g/kg)', fontsize=fontsize_labels)\n",
    "    plt.ylabel('Altitude (m)',             fontsize=fontsize_labels)\n",
    "    plt.title(f'Profiles at {timestamp}', fontsize=fontsize_title)\n",
    "\n",
    "    # 11) Legend\n",
    "    plt.legend(loc='best', fontsize=fontsize_ticks)\n",
    "\n",
    "    # 12) Grid (light dashed)\n",
    "    plt.grid(True, linestyle='--', alpha=0.4)\n",
    "\n",
    "    # 13) Tick label sizes\n",
    "    plt.xticks(fontsize=fontsize_ticks)\n",
    "    plt.yticks(fontsize=fontsize_ticks)\n",
    "    plt.ylim(-100,3000)\n",
    "    # 14) Thicken axis spines\n",
    "    ax = plt.gca()\n",
    "    for spine in ['left', 'bottom', 'right', 'top']:\n",
    "        ax.spines[spine].set_linewidth(1.2)\n",
    "\n",
    "    # 15) (Optional) invert y‐axis if you want altitude increasing upward\n",
    "    # plt.gca().invert_yaxis()\n",
    "\n",
    "    plt.tight_layout()\n",
    "    plt.show()\n",
    "plot_qv_profiles(48, merged_data_10min)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6f8ea3b1",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "def plot_all_profiles(\n",
    "    time_index: int,\n",
    "    merged_df,\n",
    "    alt_min=-10,\n",
    "    alt_max=2000,\n",
    "    fontsize_labels=20,\n",
    "    fontsize_ticks=18,\n",
    "    fontsize_title=20\n",
    "):\n",
    "    \"\"\"\n",
    "    Plot temperature/θv, RH/AH, and qv profiles side‐by‐side for a given time index.\n",
    "\n",
    "    Parameters\n",
    "    ----------\n",
    "    time_index : int\n",
    "        Integer index of the row in merged_df to plot.\n",
    "    merged_df : pandas.DataFrame\n",
    "        DataFrame containing all merged variables. Must include these columns (array‐like):\n",
    "          - 'Radiometer_Altitude'\n",
    "          - 'Temperature_Profile'\n",
    "          - 'Theta_v_Alternative'\n",
    "          - 'temperature_altitudes_CR'\n",
    "          - 'temperature_profile_CR'\n",
    "          - 'theta_v_CR'\n",
    "          - 'Mast_Heights_Temp'\n",
    "          - 'Mast_Temperature_Profile'\n",
    "          - 'Radiometer_RH_Profile'\n",
    "          - 'Radiometer_AH_Profile'\n",
    "          - 'Mast_Heights_RH'\n",
    "          - 'Mast_RH_Profile'\n",
    "          - 'humidity_altitudes_CR'\n",
    "          - 'rel_humidity_profile_CR'\n",
    "          - 'abs_humidity_profile_CR'\n",
    "          - 'Altitude'                     (MWR altitude for qv)\n",
    "          - 'Specific Humidity (qv)'\n",
    "          - 'specific_humidity_CR'\n",
    "          - 'Mast_qv_Profile'\n",
    "          - 'TIMESTAMP'  or 'Time'\n",
    "    clim_alt_max : float, optional\n",
    "        Maximum altitude (in meters) to display on the y‐axis for all subplots.\n",
    "        Default is 2000 (i.e. 0–2 km).\n",
    "    fontsize_labels : int, optional\n",
    "        Font size for x‐ and y‐axis labels. Default is 14.\n",
    "    fontsize_ticks : int, optional\n",
    "        Font size for tick labels and legend text. Default is 12.\n",
    "    fontsize_title : int, optional\n",
    "        Font size for subplot titles. Default is 16.\n",
    "    \"\"\"\n",
    "\n",
    "    # 1) Extract the single row by index\n",
    "    row = merged_df.iloc[time_index]\n",
    "\n",
    "    # 2) Timestamp string (for all subplot titles)\n",
    "    timestamp = row.get('TIMESTAMP', row.get('Time', 'Unknown Time'))\n",
    "\n",
    "    # ────────────────────────────────────────────────────────────────────────\n",
    "    # 3) Pull out Temperature & Θᵥ profiles\n",
    "    #    - MWR (“Radiometer”)\n",
    "    mwr_alt       = row['Radiometer_Altitude']       # e.g. [0, 10, 25, …]\n",
    "    mwr_temp      = row['Temperature_Profile']       # Temperature from MWR\n",
    "    mwr_thetav    = row['Theta_v_Alternative']       # Θᵥ from MWR\n",
    "\n",
    "    #    - CR (“Cloud Radar”)\n",
    "    cr_alt_T      = row['temperature_altitudes_CR']  # e.g. [0, 20, 40, …]\n",
    "    cr_temp       = row['temperature_profile_CR']    # Temperature from CR\n",
    "    cr_thetav     = row['theta_v_CR']                # Θᵥ from CR\n",
    "\n",
    "    #    - Mast (tower)\n",
    "    mast_alt_T    = row['Mast_Heights_Temp']         # e.g. [2, 2.99, 4.47, …]\n",
    "    mast_temp     = row['Mast_Temperature_Profile']  # Temperature from Mast\n",
    "    mast_vdse     = row['Mast_VDSE_Profile']\n",
    "    # (no Mast Θᵥ, since tower data only provides Temperature & Humidity;\n",
    "    # θᵥ could be computed from T & qv if needed, but here we skip Mast θᵥ)\n",
    "\n",
    "    # ────────────────────────────────────────────────────────────────────────\n",
    "    # 4) Pull out RH & AH profiles\n",
    "    #    - MWR (“Radiometer”)\n",
    "    mwr_alt_RH    = row['Radiometer_Altitude']       # same altitude vector\n",
    "    mwr_rh        = row['Radiometer_RH_Profile']     # RH from MWR\n",
    "    mwr_ah        = row['Radiometer_AH_Profile']     # AH from MWR\n",
    "\n",
    "    #    - CR (“Cloud Radar”)\n",
    "    cr_alt_RH     = row['humidity_altitudes_CR']     # e.g. [0, 20, 40, …]\n",
    "    cr_rh         = row['rel_humidity_profile_CR']   # RH from CR\n",
    "    cr_ah         = row['abs_humidity_profile_CR']   # AH from CR\n",
    "\n",
    "    #    - Mast (tower)\n",
    "    mast_alt_RH   = row['Mast_Heights_RH']           # e.g. [2, 2.99, 4.47, …]\n",
    "    mast_rh       = row['Mast_RH_Profile']           # RH from Mast\n",
    "    # (no Mast AH, but if needed, could compute from Mast qv & T)\n",
    "\n",
    "    # ────────────────────────────────────────────────────────────────────────\n",
    "    # 5) Pull out qᵥ profiles\n",
    "    #    - MWR (“Radiometer”)\n",
    "    mwr_alt_qv    = row['Radiometer_Altitude']                   # MWR altitude vector (from mwr_vertical)\n",
    "    mwr_qv        = row['Specific Humidity (qv)']      # qᵥ (g/kg) from MWR\n",
    "\n",
    "    #    - CR (“Cloud Radar”)\n",
    "    cr_alt_qv     = row['humidity_altitudes_CR']      # same CR altitude vector\n",
    "    cr_qv         = row['specific_humidity_CR']       # qᵥ (g/kg) from CR\n",
    "\n",
    "    #    - Mast (tower)\n",
    "    mast_alt_qv   = row['Mast_Heights_RH']          # same mast altitude for temperature\n",
    "    mast_qv       = row['Mast_qv_Profile']            # qᵥ (g/kg) from Mast\n",
    "\n",
    "    # ────────────────────────────────────────────────────────────────────────\n",
    "    # 6) Open figure with 3 side‐by‐side subplots, sharing the y‐axis\n",
    "    fig, (ax_T, ax_RH, ax_qv) = plt.subplots(\n",
    "        nrows=1,\n",
    "        ncols=3,\n",
    "        figsize=(12, 6),\n",
    "        sharey=True\n",
    "    )\n",
    "\n",
    "    # ────────────────────────────────────────────────────────────────────────\n",
    "    # 7) Panel 1: Temperature & Θᵥ\n",
    "    # ------------------------------------------------\n",
    "   \n",
    " # Plot Temperature (solid lines with circle markers)\n",
    "    ax_T.plot(\n",
    "        mwr_temp,\n",
    "        mwr_alt,\n",
    "        color='red',\n",
    "        linestyle='-',\n",
    "        #marker='o',\n",
    "       # markersize=4,\n",
    "        linewidth=1.5,\n",
    "        label='T (MWR)'\n",
    "    )\n",
    "    ax_T.plot(\n",
    "        cr_temp,\n",
    "        cr_alt_T,\n",
    "        color='red',\n",
    "        linestyle='--',\n",
    "       # marker='o',\n",
    "       # markersize=4,\n",
    "        linewidth=1.5,\n",
    "        label='T (CR)'\n",
    "    )\n",
    "    ax_T.plot(\n",
    "        mast_temp,\n",
    "        mast_alt_T,\n",
    "        color='green',\n",
    "        linestyle='-',\n",
    "        marker='x',\n",
    "        markersize=6,\n",
    "        linewidth=1.5,\n",
    "        label='T (Mast)'\n",
    "    )\n",
    "\n",
    "    # Plot Θᵥ (dashed lines with diamond markers)\n",
    "    ax_T.plot(\n",
    "        mwr_thetav,\n",
    "        mwr_alt,\n",
    "        color='orange',\n",
    "        linestyle='-',\n",
    "        #marker='D',\n",
    "       # markersize=4,\n",
    "        linewidth=1.2,\n",
    "        label='Θᵥ (MWR)'\n",
    "    )\n",
    "    ax_T.plot(\n",
    "        cr_thetav,\n",
    "        cr_alt_T,\n",
    "        color='orange',\n",
    "        linestyle='--',\n",
    "        #marker='D',\n",
    "        #markersize=4,\n",
    "        linewidth=1.2,\n",
    "        label='Θᵥ (CR)'\n",
    "    )\n",
    "    \n",
    "    ax_T.plot(\n",
    "        mast_vdse,\n",
    "        mast_alt_T,\n",
    "        color='black',\n",
    "        linestyle='-',\n",
    "        marker='x',\n",
    "        markersize=6,\n",
    "        linewidth=1.5,\n",
    "        label='Θᵥ (Mast)'\n",
    "    )\n",
    "    # Configure Panel 1\n",
    "    ax_T.set_xlabel('T, Θᵥ (K)', fontsize=fontsize_labels)\n",
    "    ax_T.set_ylabel('Altitude (m)', fontsize=fontsize_labels)\n",
    "    ax_T.tick_params(axis='both', labelsize=fontsize_ticks)\n",
    "    ax_T.grid(True, linestyle='--', alpha=0.4)\n",
    "    ax_T.set_ylim(alt_min, alt_max)\n",
    "    ax_T.set_xlim(275,300)\n",
    "    for spine in ax_T.spines.values():\n",
    "        spine.set_linewidth(1.2)\n",
    "\n",
    "   # ax_T.legend(loc='upper left', fontsize=fontsize_ticks - 2)\n",
    "\n",
    "    # ────────────────────────────────────────────────────────────────────────\n",
    "    # 8) Panel 2: RH (bottom x‐axis) & AH (top x‐axis)\n",
    "    # ------------------------------------------------\n",
    "    # RH on primary axis (ax_RH)\n",
    "    ax_RH.plot(\n",
    "        mwr_rh,\n",
    "        mwr_alt_RH,\n",
    "        color='blue',\n",
    "        linestyle='-',\n",
    "        #marker='o',\n",
    "        #markersize=4,\n",
    "        linewidth=1.5,\n",
    "        label='RH (MWR)'\n",
    "    )\n",
    "    ax_RH.plot(\n",
    "        cr_rh,\n",
    "        cr_alt_RH,\n",
    "        color='black',\n",
    "        linestyle='--',\n",
    "       # marker='o',\n",
    "        #markersize=4,\n",
    "        linewidth=1.5,\n",
    "        label='RH (CR)'\n",
    "    )\n",
    "    ax_RH.plot(\n",
    "        mast_rh,\n",
    "        mast_alt_RH,\n",
    "        color='green',\n",
    "        linestyle='-',\n",
    "        marker='x',\n",
    "        markersize=6,\n",
    "        linewidth=1.5,\n",
    "        label='RH (Mast)'\n",
    "    )\n",
    "\n",
    "    ax_RH.set_xlabel('RH (%)', fontsize=fontsize_labels)\n",
    "    ax_RH.set_title(f'Profiles at {timestamp}', fontsize=fontsize_title)\n",
    "    ax_RH.tick_params(axis='both', labelsize=fontsize_ticks)\n",
    "    ax_RH.grid(True, linestyle='--', alpha=0.4)\n",
    "    ax_RH.set_ylim(alt_min, alt_max)\n",
    "    ax_RH.set_xlim(45, 100)\n",
    "\n",
    "    for spine in ax_RH.spines.values():\n",
    "        spine.set_linewidth(1.2)\n",
    "\n",
    "    # Create twin x‐axis for AH\n",
    "    ax_AH = ax_RH.twiny()\n",
    "    ax_AH.plot(\n",
    "        mwr_ah,\n",
    "        mwr_alt_RH,\n",
    "        color='gray',\n",
    "        linestyle='-',\n",
    "        #marker='s',\n",
    "        #markersize=4,\n",
    "        linewidth=1.2,\n",
    "        label='AH (MWR)'\n",
    "    )\n",
    "    ax_AH.plot(\n",
    "        cr_ah,\n",
    "        cr_alt_RH,\n",
    "        color='skyblue',\n",
    "        linestyle='--',\n",
    "       # marker='v',\n",
    "        #markersize=4,\n",
    "        linewidth=1.2,\n",
    "        label='AH (CR)'\n",
    "    )\n",
    "    ax_AH.set_xlabel('AH (g/m³)', fontsize=fontsize_labels)\n",
    "    ax_AH.tick_params(axis='x', labelsize=fontsize_ticks)\n",
    "    for spine in ax_AH.spines.values():\n",
    "        spine.set_linewidth(1.2)\n",
    "\n",
    "    # Combine legends from both axes\n",
    "    #lines_RH, labels_RH = ax_RH.get_legend_handles_labels()\n",
    "    #lines_AH, labels_AH = ax_AH.get_legend_handles_labels()\n",
    "    # Manually collect handles from both RH and AH axes\n",
    "    handles_RH, labels_RH = ax_RH.get_legend_handles_labels()\n",
    "    handles_AH, labels_AH = ax_AH.get_legend_handles_labels()\n",
    "\n",
    "# Combine both into one\n",
    "    handles_all = handles_RH + handles_AH\n",
    "    labels_all = labels_RH + labels_AH\n",
    "    ax_RH.legend(handles_all, labels_all, loc='upper right', fontsize=fontsize_ticks - 2)\n",
    "\n",
    "   # ax_RH.legend(\n",
    "    #    lines_RH + lines_AH,\n",
    "     #   labels_RH + labels_AH,\n",
    "      #  loc='upper right',\n",
    "       # fontsize=fontsize_ticks - 2\n",
    "    #)\n",
    "\n",
    "  \n",
    "    # ────────────────────────────────────────────────────────────────────────\n",
    "    # 9) Panel 3: Specific Humidity (qᵥ)\n",
    "    # ------------------------------------------------\n",
    "    ax_qv.plot(\n",
    "        mwr_qv,\n",
    "        mwr_alt_qv,\n",
    "        color='purple',\n",
    "        linestyle='-',\n",
    "        #marker='o',\n",
    "        #markersize=4,\n",
    "        linewidth=1.5,\n",
    "        label='qᵥ (MWR)'\n",
    "    )\n",
    "    ax_qv.plot(\n",
    "        cr_qv,\n",
    "        cr_alt_qv,\n",
    "        color='black',\n",
    "        linestyle='--',\n",
    "        #marker='o',\n",
    "        #markersize=4,\n",
    "        linewidth=1.5,\n",
    "        label='qᵥ (CR)'\n",
    "    )\n",
    "    ax_qv.plot(\n",
    "        mast_qv,\n",
    "        mast_alt_qv,\n",
    "        color='green',\n",
    "        linestyle='-',\n",
    "        marker='x',\n",
    "        markersize=6,\n",
    "        linewidth=1.5,\n",
    "        label='qᵥ (Mast)'\n",
    "    )\n",
    "\n",
    "    ax_qv.set_xlabel('qᵥ (g/kg)', fontsize=fontsize_labels)\n",
    "    #ax_qv.set_title(f'Profiles at {timestamp}\\nqᵥ', fontsize=fontsize_title)\n",
    "    ax_qv.tick_params(axis='both', labelsize=fontsize_ticks)\n",
    "    ax_qv.grid(True, linestyle='--', alpha=0.4)\n",
    "    ax_qv.set_ylim(alt_min, alt_max)\n",
    "    ax_qv.set_xlim(2, 10)\n",
    "\n",
    "    #ax_qv.set_xlim(4,11)\n",
    "    for spine in ax_qv.spines.values():\n",
    "        spine.set_linewidth(1.2)\n",
    "\n",
    "    ax_qv.legend(loc='upper right', fontsize=fontsize_ticks - 2)\n",
    "\n",
    "    # ────────────────────────────────────────────────────────────────────────\n",
    "    # 10) Add a single shared y‐axis label (for altitude)\n",
    "    #fig.text(\n",
    "    #    0.06, 0.5,\n",
    "    #    'Altitude (m)',\n",
    "     #   va='center',\n",
    "     #   rotation='vertical',\n",
    "     #   fontsize=fontsize_labels\n",
    "   # )\n",
    "    # 14) Thicken axis spines\n",
    "    ax = plt.gca()\n",
    "    for spine in ['left', 'bottom', 'right', 'top']:\n",
    "        ax.spines[spine].set_linewidth(1.2)\n",
    "        # Add parcel-method CBL height to all three subplots (if available)\n",
    "    zi_parcel = row.get('zi_parcel', None)\n",
    "    lcl_qv=    row.get('LCL_qv_T_2.0m',None)\n",
    "    lcl_romps=row.get('LCL_romps_km',None)\n",
    "    if zi_parcel is not None and not pd.isna(zi_parcel):\n",
    "        for ax in [ax_T, ax_RH, ax_qv]:\n",
    "            ax.axhline(zi_parcel, color='magenta', linestyle='--', linewidth=1.5, label='CBL (Parcel)')\n",
    "    if lcl_qv is not None and not pd.isna(lcl_qv):\n",
    "        for ax in [ax_T, ax_RH, ax_qv]:\n",
    "            ax.axhline(lcl_qv*1000, color='brown', linestyle='--', linewidth=1.5, label='LCL (qv)')\n",
    "    if lcl_romps is not None and not pd.isna(lcl_romps):\n",
    "        for ax in [ax_T, ax_RH, ax_qv]:\n",
    "            ax.axhline(lcl_romps*1000, color='green', linestyle='--', linewidth=1.5, label='LCL (Romps)')\n",
    "                   \n",
    "\n",
    "        # Add legend to each subplot (optional, or just one for cleanliness)\n",
    "        ax_T.legend(loc='upper left', fontsize=fontsize_ticks - 8)\n",
    "        ax_RH.legend(loc='upper right', fontsize=fontsize_ticks -8)\n",
    "        ax_AH.legend(loc='right', fontsize=fontsize_ticks -8)\n",
    "\n",
    "        ax_qv.legend(loc='upper right', fontsize=fontsize_ticks - 8)\n",
    "    plt.tight_layout(rect=[0.1, 0.05, 1, 0.95])\n",
    "    plt.show()\n",
    "plot_all_profiles(38,merged_data_10min)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4b488a89",
   "metadata": {},
   "source": [
    "### Bulk relations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4a9c6ec2",
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# 2) Then write them into a simple one‐page PDF:\n",
    "column_names = merged_data_10min.columns.tolist()\n",
    "\n",
    "fig, ax = plt.subplots(figsize=(8, len(column_names)*0.3))\n",
    "ax.axis('off')\n",
    "\n",
    "for i, col in enumerate(column_names):\n",
    "    ax.text(0.01, 1 - (i+1)*0.03, col, fontsize=12, va='top')\n",
    "\n",
    "#Edit before running!!!magda\n",
    "pdf_path = rf\"C:\\path\\to\\your\\Master_Thesis\\merged_columns_list.pdf\"\n",
    "\n",
    "plt.savefig(pdf_path, format='pdf', bbox_inches='tight')\n",
    "plt.close(fig)\n",
    "\n",
    "print(f\"Saved column list to {pdf_path}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7099ecaf",
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "Cd=10e-3\n",
    "merged_data_10min['u_star']=((merged_data_10min['uw_flux_corr']**2)+(merged_data_10min['vw_flux_corr']**2))**0.25\n",
    "merged_data_10min['Cd']=(merged_data_10min['u_star']/merged_data_10min['WS_ms_D15463_Avg'])**2\n",
    "merged_data_10min['SHF_bulk']=merged_data_10min['rho_air_Tv']*Cp*merged_data_10min['Cd']*merged_data_10min['WS_ms_D15463_Avg']*(merged_data_10min['T_srf']-merged_data_10min['Dry_Static_Energy_10'])\n",
    "merged_data_10min['LHF_bulk']=merged_data_10min['rho_air_Tv']*Lv*merged_data_10min['Cd']*merged_data_10min['WS_ms_D15463_Avg']*(merged_data_10min['qsat_srf']-merged_data_10min['qv_10m'])/1000\n",
    "print(merged_data_10min)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "38e08fcc",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Assuming Cp and Lv are defined previously in the code as constants\n",
    "# Calculate the sign agreement for SHF and LHF\n",
    "merged_data_10min['SHF_sign_match'] = np.sign(merged_data_10min['SHF_bulk']) == np.sign(merged_data_10min['SHF'])\n",
    "merged_data_10min['LHF_sign_match'] = np.sign(merged_data_10min['LHF_bulk']) == np.sign(merged_data_10min['LHF'])\n",
    "\n",
    "\n",
    "# Calculate the separate Sign Agreement Percentages for SHF and LHF\n",
    "shf_sign_agreement_percentage = merged_data_10min['SHF_sign_match'].mean() * 100\n",
    "lhf_sign_agreement_percentage = merged_data_10min['LHF_sign_match'].mean() * 100\n",
    "\n",
    "# Output the results\n",
    "print(f\"SHF Sign Agreement Percentage: {shf_sign_agreement_percentage:.2f}%\")\n",
    "print(f\"LHF Sign Agreement Percentage: {lhf_sign_agreement_percentage:.2f}%\")\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d76e9097",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Plot sensible heat flux against time\n",
    "# Ensure TIMESTAMP is in datetime format\n",
    "merged_data_10min['Time'] = pd.to_datetime(merged_data_10min['Time'])\n",
    "merged_data_10min.rename(columns={'Time': 'TIMESTAMP'}, inplace=True)\n",
    "\n",
    "\n",
    "plt.figure(figsize=(10, 6))\n",
    "plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['SHF'], s=10, alpha=0.7,label='SHF')\n",
    "plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['SHF_bulk'], s=10, alpha=0.7,label='SHF_bulk')\n",
    "\n",
    "# Format the x-axis to show time in HH:MM format and set major ticks\n",
    "plt.gca().xaxis.set_major_locator(mdates.HourLocator(interval=2))  # Set ticks every hour\n",
    "plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "\n",
    "# Rotate the x-axis labels for better readability\n",
    "plt.gcf().autofmt_xdate()\n",
    "\n",
    "plt.xlabel('Time')\n",
    "plt.ylabel('Heat Flux (W/m^2)')\n",
    "plt.title('SHF vs Time (averaged in 10 min intervals)')\n",
    "plt.legend()\n",
    "plt.grid(True)\n",
    "#plt.tight_layout()\n",
    "plt.savefig(os.path.join(data_dir, 'shf_bulk_plot_10min.png'),dpi=300)  # Save the plot as a JPEG file\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2d546439",
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# Plot sensible heat flux against time\n",
    "# Ensure TIMESTAMP is in datetime format\n",
    "merged_data_10min['TIMESTAMP'] = pd.to_datetime(merged_data_10min['TIMESTAMP'])\n",
    "plt.figure(figsize=(10, 6))\n",
    "plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['LHF'], s=10, alpha=0.7,label='LHF')\n",
    "plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['LHF_bulk'], s=10, alpha=0.7,label='LHF_bulk')\n",
    "\n",
    "# Format the x-axis to show time in HH:MM format and set major ticks\n",
    "plt.gca().xaxis.set_major_locator(mdates.HourLocator(interval=2))  # Set ticks every hour\n",
    "plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "\n",
    "# Rotate the x-axis labels for better readability\n",
    "plt.gcf().autofmt_xdate()\n",
    "\n",
    "plt.xlabel('Time')\n",
    "plt.ylabel('Heat Flux (W/m^2)')\n",
    "plt.title('LHF vs Time (averaged in 10 min intervals)')\n",
    "plt.legend()\n",
    "plt.grid(True)\n",
    "#plt.tight_layout()\n",
    "plt.savefig(os.path.join(data_dir, 'lhf_bulk_plot_10min.png'),dpi=300)  # Save the plot as a JPEG file\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d41e88a6",
   "metadata": {},
   "source": [
    "### Diurnal plots"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f622f28b",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# --- Filter data where Rain == 0 and Flag == 0 ---\n",
    "filtered_df = merged_data_10min[(merged_data_10min.get('Flag_Rain', 0) == 0) & (merged_data_10min.get('Flag', 0) == 0)]\n",
    "filtered_df['TIMESTAMP'] = pd.to_datetime(filtered_df['TIMESTAMP'])\n",
    "filtered_df.rename(columns={'TIMESTAMP': 'Time'}, inplace=True)\n",
    "\n",
    "\n",
    "# Set up the figure and axes\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "# Heights dictionary\n",
    "heights = {\n",
    "    'Temperature_K_2': '2 m',\n",
    "    'Temperature_K_2.99': '2.99 m',\n",
    "    'Temperature_K_4.47': '4.47 m',\n",
    "    'Temperature_K_6.69': '6.69 m',\n",
    "    'Temperature_K_10': '10 m'\n",
    "}\n",
    "\n",
    "# Plot temperatures at different mast heights with thicker lines\n",
    "for col, label in heights.items():\n",
    "    ax.plot(\n",
    "        filtered_df['Time'],\n",
    "        filtered_df[col],\n",
    "        label=f'{label}',\n",
    "        linewidth=1.5\n",
    "    )\n",
    "\n",
    "# Plot T_srf (from longwave radiation) with a bolder dashed line\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['T_srf'],\n",
    "    label='Surface T$_{srf}$',\n",
    "    color='black',\n",
    "    linestyle='--',\n",
    "    linewidth=2.0\n",
    ")\n",
    "\n",
    "# --- Formatting Spines (thicker borders) ---\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# --- Tick parameters ---\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# --- Format x-axis to show every 2 hours in HH:MM ---\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Rotate x‐tick labels for readability\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "# --- Labels and title with bold font ---\n",
    "ax.set_title('Diurnal Variation of Air and Surface Temperature', fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('Temperature (K)', fontsize=20, fontweight='bold')\n",
    "\n",
    "# --- Grid styling ---\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# --- Legend styling ---\n",
    "legend = ax.legend(\n",
    "    title='Measurement Height',\n",
    "    fontsize=14,\n",
    "    title_fontsize=14,\n",
    "    frameon=True\n",
    ")\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# --- Y‐axis tick label size ---\n",
    "ax.tick_params(axis='y', labelsize=18)\n",
    "\n",
    "\n",
    "# --- Tight layout and display ---\n",
    "plt.tight_layout()\n",
    "output_filename = 'diurnal_temperature_plot.png'\n",
    "output_path = os.path.join(folder_path, output_filename)\n",
    "plt.savefig(output_path, dpi=300,bbox_inches='tight')\n",
    "\n",
    "# Print confirmation\n",
    "print(f\"Plot saved to: {output_path}\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "04ab6b68",
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# Define heights and corresponding variable names\n",
    "heights = ['2', '2.99', '4.47', '6.69', '10']\n",
    "sdry_vars = [f'Dry_Static_Energy_{h}' for h in heights]\n",
    "qv_vars = [f'qv_{h}m' for h in heights]\n",
    "# Mapping of wind speed columns\n",
    "wind_speed_cols = {\n",
    "    'WS_ms_D15008_Avg': '2 m',\n",
    "    'WS_ms_D15014_Avg': '4.47 m',\n",
    "    'WS_ms_D15463_Avg': '10 m'\n",
    "}\n",
    "\n",
    "# 1) Diurnal Variation of Dry Static Energy (s_dry / c_p)\n",
    "# ------------------------------------------------------------\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "for h, var in zip(heights, sdry_vars):\n",
    "    ax.plot(\n",
    "        filtered_df['Time'],\n",
    "        filtered_df[var],\n",
    "        label=f'{h} m',\n",
    "        linewidth=1.5\n",
    "    )\n",
    "\n",
    "# Title and labels\n",
    "ax.set_title('Diurnal Variation of Dry Static Energy (s$_{dry}/c_p$)',\n",
    "             fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('Dry Static Energy (K)', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick formatting\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis date formatting: every 2 hours\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Grid styling\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# Legend\n",
    "legend = ax.legend(title='Measurement Height', fontsize=14, title_fontsize=14, frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Rotate x‐tick labels\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "plt.tight_layout()\n",
    "\n",
    "# Save figure\n",
    "output_path = os.path.join(folder_path, 'dry_static_energy_diurnal.png')\n",
    "plt.savefig(output_path, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {output_path}\")\n",
    "\n",
    "plt.show()\n",
    "\n",
    "\n",
    "# ------------------------------------------------------------\n",
    "# 2) Diurnal Variation of Specific Humidity (q_v)\n",
    "# ------------------------------------------------------------\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "for h, var in zip(heights, qv_vars):\n",
    "    ax.plot(\n",
    "        filtered_df['Time'],\n",
    "        filtered_df[var],\n",
    "        label=f'{h} m',\n",
    "        linewidth=1.5\n",
    "    )\n",
    "\n",
    "# Title and labels\n",
    "ax.set_title('Diurnal Variation of Specific Humidity (q$_v$)',\n",
    "             fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('Specific Humidity (g/kg)', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick formatting\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis date formatting: every 2 hours\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Grid styling\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# Legend\n",
    "legend = ax.legend(title='Measurement Height', fontsize=14, title_fontsize=14, frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Rotate x‐tick labels\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "plt.tight_layout()\n",
    "\n",
    "# Save figure\n",
    "output_path = os.path.join(folder_path, 'specific_humidity_diurnal.png')\n",
    "plt.savefig(output_path, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {output_path}\")\n",
    "\n",
    "plt.show()\n",
    "\n",
    "\n",
    "# ------------------------------------------------------------\n",
    "# 3) Diurnal Variation of Virtual Dry Static Energy (s_{dry,v}/c_p)\n",
    "# ------------------------------------------------------------\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "for h in heights:\n",
    "    col = f'Virtual_Dry_Static_Energy_{h}'\n",
    "    ax.plot(\n",
    "        filtered_df['Time'],\n",
    "        filtered_df[col],\n",
    "        label=f'{h} m',\n",
    "        linewidth=1.5\n",
    "    )\n",
    "\n",
    "# Title and labels\n",
    "ax.set_title('Diurnal Variation of Virtual Dry Static Energy (s$_{dry,v}/c_p$)',\n",
    "             fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('Virtual Dry Static Energy (K)', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick formatting\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis date formatting: every 2 hours\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Grid styling\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# Legend\n",
    "legend = ax.legend(title='Measurement Height', fontsize=14, title_fontsize=14, frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Rotate x‐tick labels\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "plt.tight_layout()\n",
    "\n",
    "# Save figure\n",
    "output_path = os.path.join(folder_path, 'virtual_dry_static_energy_diurnal.png')\n",
    "plt.savefig(output_path, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {output_path}\")\n",
    "\n",
    "plt.show()\n",
    "\n",
    "\n",
    "# ------------------------------------------------------------\n",
    "# 4) Diurnal Variation of Wind Speed at Different Heights\n",
    "# ------------------------------------------------------------\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "for col, label in wind_speed_cols.items():\n",
    "    ax.plot(\n",
    "        filtered_df['Time'],\n",
    "        filtered_df[col],\n",
    "        label=f'{label}',\n",
    "        linewidth=1.5\n",
    "    )\n",
    "\n",
    "# Title and labels\n",
    "ax.set_title('Diurnal Variation of Wind Speed',\n",
    "             fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('Wind Speed (m/s)', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick formatting\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis date formatting: every 2 hours\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Grid styling\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# Legend\n",
    "legend = ax.legend(title='Measurement Height', fontsize=14, title_fontsize=14, frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Rotate x‐tick labels\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "plt.tight_layout()\n",
    "\n",
    "# Save figure\n",
    "output_path = os.path.join(folder_path, 'wind_speed_diurnal.png')\n",
    "plt.savefig(output_path, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {output_path}\")\n",
    "\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "64dddb33",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Define measurement heights in meters\n",
    "height_levels = [2, 2.99, 4.47, 6.69, 10]\n",
    "height_pairs = list(zip(height_levels[:-1], height_levels[1:]))\n",
    "\n",
    "# Calculate vertical gradients (lapse rates) of s_{dry,v}/c_p between levels\n",
    "for h1, h2 in height_pairs:\n",
    "    col1 = f'Virtual_Dry_Static_Energy_{h1}'\n",
    "    col2 = f'Virtual_Dry_Static_Energy_{h2}'\n",
    "    lapse_col = f'Lapse_sdryv_{h1}-{h2}_Kperm'\n",
    "    \n",
    "    dz = h2 - h1  # height difference in meters\n",
    "    filtered_df[lapse_col] = (filtered_df[col2] - filtered_df[col1]) / dz\n",
    "\n",
    "# Plot the lapse rates\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "for h1, h2 in height_pairs:\n",
    "    lapse_col = f'Lapse_sdryv_{h1}-{h2}_Kperm'\n",
    "    ax.plot(filtered_df['Time'], filtered_df[lapse_col], label=f'{h1}-{h2} m', linewidth=1.5)\n",
    "\n",
    "# Formatting\n",
    "ax.set_title('Vertical Gradient of Virtual Dry Static Energy (ds$_{dry,v}$/dz)', fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('Gradient (K/m)', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.legend(title='Layer', fontsize=14, title_fontsize=14, frameon=True)\n",
    "plt.xticks(rotation=45)\n",
    "plt.tight_layout()\n",
    "\n",
    "# Save figure\n",
    "output_path = os.path.join(folder_path, 'lapse_rate_sdryv_diurnal.png')\n",
    "plt.savefig(output_path, dpi=300, bbox_inches='tight')\n",
    "print(f\"Plot saved to: {output_path}\")\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e4d86a57",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define measurement heights in meters\n",
    "height_levels = [2, 2.99, 4.47, 6.69, 10]\n",
    "height_pairs = list(zip(height_levels[:-1], height_levels[1:]))\n",
    "\n",
    "# Calculate vertical gradients (lapse rates) of s_{dry,v}/c_p between levels\n",
    "for h1, h2 in height_pairs:\n",
    "    col1 = f'Virtual_Dry_Static_Energy_{h1}'\n",
    "    col2 = f'Virtual_Dry_Static_Energy_{h2}'\n",
    "    lapse_col = f'Lapse_sdryv_{h1}-{h2}_Kperm'\n",
    "    \n",
    "    dz = h2 - h1  # height difference in meters\n",
    "    filtered_df[lapse_col] = (filtered_df[col2] - filtered_df[col1]) / dz\n",
    "\n",
    "# Plot the lapse rates\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "for h1, h2 in height_pairs:\n",
    "    lapse_col = f'Lapse_sdryv_{h1}-{h2}_Kperm'\n",
    "\n",
    "    # Plot curve and grab its assigned color\n",
    "    line, = ax.plot(filtered_df['Time'], filtered_df[lapse_col],\n",
    "                    label=f'{h1}-{h2} m', linewidth=1.5)\n",
    "    color = line.get_color()\n",
    "\n",
    "    # ---- Find zero-crossings for this gradient ----\n",
    "    grad = filtered_df[lapse_col].to_numpy()\n",
    "    time_vals = filtered_df['Time'].to_numpy()\n",
    "    sign = np.sign(grad)\n",
    "\n",
    "    # zero-crossings = where sign changes between consecutive points\n",
    "    zc_idx = np.where(sign[1:] * sign[:-1] < 0)[0] + 1\n",
    "    zc_times = time_vals[zc_idx]\n",
    "\n",
    "    # Draw vertical lines in the same color as the curve\n",
    "    for t in zc_times:\n",
    "        ax.axvline(t, color=color, linestyle='--', alpha=0.9, linewidth=1)\n",
    "\n",
    "# Formatting\n",
    "ax.set_title('Vertical Gradient of Virtual Dry Static Energy (ds$_{dry,v}$/dz)', fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('Gradient (K/m)', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "\n",
    "ax.legend(title='Layer', fontsize=14, title_fontsize=14, frameon=True)\n",
    "plt.xticks(rotation=45)\n",
    "plt.tight_layout()\n",
    "\n",
    "# Save figure\n",
    "output_path = os.path.join(folder_path, 'lapse_rate_sdryv_diurnal.png')\n",
    "plt.savefig(output_path, dpi=300, bbox_inches='tight')\n",
    "print(f\"Plot saved to: {output_path}\")\n",
    "\n",
    "plt.show()\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5e4d465b",
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# 1) Longwave Radiation (LW) Over Time\n",
    "# ------------------------------------------------------------\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "# Plot downward, upward, and net LW with thicker lines\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['IR20Dn'],\n",
    "    label='LW↓ (Downward)',\n",
    "    color='blue',\n",
    "    linewidth=2.0\n",
    ")\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['IR20Up'],\n",
    "    label='LW↑ (Upward)',\n",
    "    color='orange',\n",
    "    linewidth=2.0\n",
    ")\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['NetRl'],\n",
    "    label='LW$_{net}$ (Net)',\n",
    "    color='green',\n",
    "    linewidth=2.0\n",
    ")\n",
    "\n",
    "# Title and labels in bold\n",
    "ax.set_title('Longwave Radiation (LW) vs. Time', fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('Radiation (W/m²)', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken all spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick parameters for major and minor ticks\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis formatting: show HH:MM every 2 hours, minor ticks every 30 minutes\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Rotate x‐tick labels for readability\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "# Grid styling: dashed major, dotted minor\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# Legend styling\n",
    "legend = ax.legend(fontsize=14, title='LW Components', title_fontsize=14, frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Y‐axis tick label size\n",
    "ax.tick_params(axis='y', labelsize=18)\n",
    "\n",
    "plt.tight_layout()\n",
    "\n",
    "# Save figure\n",
    "lw_output = os.path.join(folder_path, 'longwave_radiation_diurnal.png')\n",
    "plt.savefig(lw_output, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {lw_output}\")\n",
    "\n",
    "plt.show()\n",
    "\n",
    "\n",
    "# ------------------------------------------------------------\n",
    "# 2) Shortwave Radiation (SW) Over Time\n",
    "# ------------------------------------------------------------\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "# Plot downward, upward, and net SW with thicker lines\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['SR15D1Dn_Irr'],\n",
    "    label='SW↓ (Downward)',\n",
    "    color='purple',\n",
    "    linewidth=2.0\n",
    ")\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['SR15D1Up_Irr'],\n",
    "    label='SW↑ (Upward)',\n",
    "    color='red',\n",
    "    linewidth=2.0\n",
    ")\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['NetRs'],\n",
    "    label='SW$_{net}$ (Net)',\n",
    "    color='darkgreen',\n",
    "    linewidth=2.0\n",
    ")\n",
    "\n",
    "# Title and labels in bold\n",
    "ax.set_title('Shortwave Radiation (SW) vs. Time', fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('Radiation (W/m²)', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken all spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick parameters\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis formatting: every 2 hours\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Rotate x‐tick labels\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "# Grid styling\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# Legend styling\n",
    "legend = ax.legend(fontsize=14, title='SW Components', title_fontsize=14, frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Y‐axis tick label size\n",
    "ax.tick_params(axis='y', labelsize=18)\n",
    "\n",
    "plt.tight_layout()\n",
    "\n",
    "# Save figure\n",
    "sw_output = os.path.join(folder_path, 'shortwave_radiation_diurnal.png')\n",
    "plt.savefig(sw_output, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {sw_output}\")\n",
    "\n",
    "plt.show()\n",
    "# 3)Radiation Over Time\n",
    "# ------------------------------------------------------------\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "# Scatter for measured Net_Rad (negative)\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['Net_Radiation_10min'],\n",
    "    label='F$_{net}$ (Net)',\n",
    "\n",
    "    color='red',\n",
    "    linewidth=2.0\n",
    ")\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['NetRl'],\n",
    "    label='LW$_{net}$ (Net)',\n",
    "    color='green',\n",
    "    linewidth=2.0\n",
    ")\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['NetRs'],\n",
    "    label='SW$_{net}$ (Net)',\n",
    "    color='blue',\n",
    "    linewidth=2.0\n",
    ")\n",
    "\n",
    "# Title and labels in bold\n",
    "ax.set_title('Net Radiation vs. Time', fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('Radiation (W/m²)', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken all spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick parameters\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis formatting: every 2 hours\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Rotate x‐tick labels\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "# Grid styling\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# Legend styling\n",
    "legend = ax.legend(fontsize=14, title='Components', title_fontsize=14, frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Y‐axis tick label size\n",
    "ax.tick_params(axis='y', labelsize=18)\n",
    "\n",
    "plt.tight_layout()\n",
    "\n",
    "# Save figure\n",
    "sw_output = os.path.join(folder_path, 'radiation_diurnal.png')\n",
    "plt.savefig(sw_output, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {sw_output}\")\n",
    "\n",
    "plt.show()\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "87abbd8f",
   "metadata": {},
   "outputs": [],
   "source": [
    "#print(filtered_df['tau_xz_corr'])\n",
    "#tau_yz_corr\n",
    "#tau_xy_corr'])\n",
    "# 1) Diurnal Variation of Sensible Heat Flux (SHF)\n",
    "# ------------------------------------------------------------\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "# Plot measured SHF\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['tau_xz_corr'],\n",
    "    marker='o',\n",
    "    linestyle='-',\n",
    "    markersize=4,\n",
    "    linewidth=1.5,\n",
    "    alpha=0.8,\n",
    "    color='green',\n",
    "    label='${\\\\tau}_{xz}$'\n",
    ")\n",
    "\n",
    "# Plot bulk‐estimate SHF\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['tau_yz_corr'],\n",
    "    marker='s',\n",
    "    linestyle='--',\n",
    "    markersize=4,\n",
    "    linewidth=1.5,\n",
    "    alpha=0.8,\n",
    "    color='red',\n",
    "    label='${\\\\tau}_{yz}$'\n",
    ")\n",
    "# Plot bulk‐estimate SHF\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['tau_xy_corr'],\n",
    "    marker='x',\n",
    "    linestyle='--',\n",
    "    markersize=4,\n",
    "    linewidth=1.5,\n",
    "    alpha=0.8,\n",
    "    color='blue',\n",
    "    label='${\\\\tau_{xy}}$'\n",
    ")\n",
    "# Title and labels in bold\n",
    "ax.set_title('Diurnal Variation of Momentum Flux', fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('Shear stress (Pa)', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick parameters\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis formatting: every 2 hours\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Rotate x‐tick labels\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "# Grid styling\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# Legend styling\n",
    "legend = ax.legend(loc='upper left', fontsize=14, frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Show and save\n",
    "plt.tight_layout()\n",
    "shf_path = os.path.join(folder_path, 'diurnal_tau.png')\n",
    "plt.savefig(shf_path, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {shf_path}\")\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9c217935",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "# Assuming Cp and Lv are defined previously in the code as constants\n",
    "# Calculate the sign agreement for SHF and LHF\n",
    "filtered_df['SHF_sign_match_again'] = np.sign(filtered_df['SHF_bulk']) == np.sign(filtered_df['SHF'])\n",
    "filtered_df['LHF_sign_match_again'] = np.sign(filtered_df['LHF_bulk']) == np.sign(filtered_df['LHF'])\n",
    "\n",
    "\n",
    "# Calculate the separate Sign Agreement Percentages for SHF and LHF\n",
    "shf_sign_agreement_percentage = filtered_df['SHF_sign_match'].mean() * 100\n",
    "lhf_sign_agreement_percentage = filtered_df['LHF_sign_match'].mean() * 100\n",
    "\n",
    "# Output the results\n",
    "print(f\"SHF Sign Agreement Percentage: {shf_sign_agreement_percentage:.2f}%\")\n",
    "print(f\"LHF Sign Agreement Percentage: {lhf_sign_agreement_percentage:.2f}%\")\n",
    "\n",
    "# 1) Diurnal Variation of Sensible Heat Flux (SHF)\n",
    "# ------------------------------------------------------------\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "# Plot measured SHF\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['SHF'],\n",
    "    marker='o',\n",
    "    linestyle='-',\n",
    "    markersize=4,\n",
    "    linewidth=1.5,\n",
    "    alpha=0.8,\n",
    "    color='darkorange',\n",
    "    label='SHF (measured)'\n",
    ")\n",
    "\n",
    "# Plot bulk‐estimate SHF\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['SHF_bulk'],\n",
    "    marker='s',\n",
    "    linestyle='--',\n",
    "    markersize=4,\n",
    "    linewidth=1.5,\n",
    "    alpha=0.8,\n",
    "    color='gray',\n",
    "    label='SHF (bulk estimate)'\n",
    ")\n",
    "\n",
    "# Title and labels in bold\n",
    "ax.set_title('Diurnal Variation of Sensible Heat Flux', fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('Sensible Heat Flux (W/m²)', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick parameters\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis formatting: every 2 hours\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Rotate x‐tick labels\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "# Grid styling\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# Legend styling\n",
    "legend = ax.legend(loc='upper left', fontsize=14, frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Show and save\n",
    "plt.tight_layout()\n",
    "shf_path = os.path.join(folder_path, 'diurnal_shf.png')\n",
    "plt.savefig(shf_path, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {shf_path}\")\n",
    "plt.show()\n",
    "\n",
    "\n",
    "# ------------------------------------------------------------\n",
    "# 2) Diurnal Variation of Latent Heat Flux (LHF)\n",
    "# ------------------------------------------------------------\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "# Plot measured LHF\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['LHF'],\n",
    "    marker='o',\n",
    "    linestyle='-',\n",
    "    markersize=4,\n",
    "    linewidth=1.5,\n",
    "    alpha=0.8,\n",
    "    color='lightskyblue',\n",
    "    label='LHF (measured)'\n",
    ")\n",
    "\n",
    "# Plot bulk‐estimate LHF\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['LHF_bulk'],\n",
    "    marker='s',\n",
    "    linestyle='--',\n",
    "    markersize=4,\n",
    "    linewidth=1.5,\n",
    "    alpha=0.8,\n",
    "    color='gray',\n",
    "    label='LHF (bulk estimate)'\n",
    ")\n",
    "\n",
    "# Title and labels in bold\n",
    "ax.set_title('Diurnal Variation of Latent Heat Flux', fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('Latent Heat Flux (W/m²)', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick parameters\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis formatting: every 2 hours\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Rotate x‐tick labels\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "# Grid styling\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# Legend styling\n",
    "legend = ax.legend(loc='upper left', fontsize=14,frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Show and save\n",
    "plt.tight_layout()\n",
    "lhf_path = os.path.join(folder_path, 'diurnal_lhf.png')\n",
    "plt.savefig(lhf_path, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {lhf_path}\")\n",
    "plt.show()\n",
    "\n",
    "\n",
    "# ------------------------------------------------------------\n",
    "# 3) Combined Diurnal Variation of SHF and LHF\n",
    "# ------------------------------------------------------------\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "# Plot LHF (measured)\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['LHF'],\n",
    "    marker='o',\n",
    "    linestyle='-',\n",
    "    markersize=4,\n",
    "    linewidth=1.5,\n",
    "    alpha=0.8,\n",
    "    color='lightskyblue',\n",
    "    label='LHF (measured)'\n",
    ")\n",
    "\n",
    "# Plot SHF (measured)\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['SHF'],\n",
    "    marker='s',\n",
    "    linestyle='-',\n",
    "    markersize=4,\n",
    "    linewidth=1.5,\n",
    "    alpha=0.8,\n",
    "    color='darkorange',\n",
    "    label='SHF (measured)'\n",
    ")\n",
    "\n",
    "# Title and labels\n",
    "ax.set_title('Diurnal Variation of SHF and LHF', fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('Heat Flux (W/m²)', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick parameters\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis formatting: every 2 hours\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Rotate x‐tick labels\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "# Grid styling\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# Legend styling\n",
    "legend = ax.legend(loc='upper left', fontsize=14, frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Show and save\n",
    "plt.tight_layout()\n",
    "combined_path = os.path.join(folder_path, 'diurnal_shf_lhf_combined.png')\n",
    "plt.savefig(combined_path, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {combined_path}\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b45427f5",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "# Plot CO₂ concentration with markers and a bold line\n",
    "ax.plot(\n",
    "    filtered_df['Time'],\n",
    "    filtered_df['PPM_CO2'],\n",
    "    marker='o',\n",
    "    markersize=4,\n",
    "    linestyle='-',\n",
    "    linewidth=1.5,\n",
    "    color='green',\n",
    "    label='CO₂ Concentration'\n",
    ")\n",
    "\n",
    "# Title and labels in bold\n",
    "ax.set_title('CO₂ Concentration Over Time', fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('CO₂ Concentration (ppm)', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken all spines (axis borders)\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick parameters for major & minor ticks\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis formatting: show HH:MM every 2 hours, minor ticks every 30 min\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Rotate x‐tick labels for readability\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "# Grid styling: dashed major, dotted minor\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# (Optional) If you want a legend, uncomment:\n",
    "# legend = ax.legend(loc='upper right', fontsize=12, frameon=True)\n",
    "# legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Tight layout\n",
    "plt.tight_layout()\n",
    "\n",
    "# Save figure\n",
    "co2_output = os.path.join(folder_path, 'co2_concentration_time_series.png')\n",
    "plt.savefig(co2_output, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {co2_output}\")\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c6b7dcb0",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Plot CO2 flux from both datasets\n",
    "plt.figure(figsize=(12, 6))\n",
    "\n",
    "# Plot merged_data_10min with circles ('o')\n",
    "plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['F_CO2'], s=50, alpha=0.7, label='Warmenhuizen (10 min)', marker='o')\n",
    "\n",
    "# Plot vl with crosses ('x')\n",
    "plt.scatter(vl['Timestamp'], vl['F_CO2_ppm_ms'], s=50, alpha=0.7, label='Veenkampen (30 min)', marker='x')\n",
    "\n",
    "plt.scatter(loo['Timestamp'], loo['F_CO2_ppm_ms'], s=50, alpha=0.7, label='Loobos (30min)',marker='+')\n",
    "\n",
    "plt.scatter(ams['Timestamp'], ams['F_CO2_ppm_ms'], s=50, alpha=0.7,label='Amsterdam (30min)', marker='d')\n",
    "\n",
    "plt.title('CO2 Flux Comparison')\n",
    "plt.xlabel('Time')\n",
    "plt.ylabel('CO2 Flux (ppm*ms^1)')\n",
    "plt.legend()\n",
    "plt.grid(True)\n",
    "plt.xticks(rotation=45)\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "60d9b325",
   "metadata": {},
   "source": [
    "### LW_dn vs WVP (IVP)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "465388bd",
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "print(merged_data_10min.columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2afc2f72",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Plotting IR20Dn vs wvp\n",
    "plt.figure(figsize=(10, 6))\n",
    "plt.scatter(merged_data_10min['IWV'], merged_data_10min['IR20Dn'], alpha=0.5)\n",
    "plt.xlabel('Integrated Water Vapor (kg/m^2)')\n",
    "plt.ylabel('LW_dn radiation (W/m^2)')\n",
    "plt.title('LW_dn vs Integrated Water Vapor')\n",
    "plt.grid(True)\n",
    "# Save the figure in the data_dir directory\n",
    "save_path = os.path.join(data_dir, 'LW_dn_vs_IWV.png')\n",
    "plt.savefig(save_path)\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5623db85",
   "metadata": {},
   "source": [
    "### SW_dn vs LWP"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2625fdf0",
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# Plotting SR15D1Dn_Irr vs LWP\n",
    "plt.figure(figsize=(10, 6))\n",
    "plt.scatter(merged_data_10min['LWP'], merged_data_10min['SR15D1Dn_Irr'], alpha=0.5)\n",
    "plt.xlabel('Liquid Water Path (g/m^2)')\n",
    "plt.ylabel('SW_dn Irradiance (W/m^2)')\n",
    "plt.title('SW_dn vs Liquid Water Path')\n",
    "plt.grid(True)\n",
    "\n",
    "# Save the figure in the data_dir directory\n",
    "save_path = os.path.join(data_dir, 'SW_dn_vs_LWP.png')\n",
    "plt.savefig(save_path)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9e7287be",
   "metadata": {},
   "source": [
    "### LWP vs time"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0219473c",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Plotting LWP vs Time\n",
    "plt.figure(figsize=(12, 6))\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LWP'], marker='o', linestyle='-')\n",
    "plt.xlabel('Time')\n",
    "plt.ylabel('Liquid Water Path (g/m^2)')\n",
    "plt.title('Liquid Water Path vs Time')\n",
    "plt.grid(True)\n",
    "plt.xticks(rotation=45)\n",
    "plt.tight_layout()\n",
    "\n",
    "# Save the figure in the data_dir directory\n",
    "save_path = os.path.join(data_dir, 'LWP_vs_Time_10min.png')\n",
    "plt.savefig(save_path)\n",
    "plt.show()  # Close the plot to release memory\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0b55f5ca",
   "metadata": {},
   "source": [
    "### T at 2.99m, T_sonic"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7cf08f50",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Plotting Temperature_K_2.99, T_srf, and Average_Temperature_Corr vs Time\n",
    "plt.figure(figsize=(12, 6))\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['Temperature_K_2.99'], marker='o', linestyle='-', color='b', label='Temperature_K_2.99')\n",
    "#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['T_srf'], marker='o', linestyle='-', color='g', label='T_srf')\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['Average_Temperature_Corr'], marker='o', linestyle='-', color='r', label='Average_Temperature_Corr')\n",
    "plt.xlabel('Time')\n",
    "plt.ylabel('Temperature (K)')\n",
    "plt.title('Temperature Variations Over Time')\n",
    "plt.legend()\n",
    "plt.grid(True)\n",
    "plt.xticks(rotation=45)\n",
    "plt.tight_layout()\n",
    "\n",
    "save_path = os.path.join(data_dir, 'Temperature_Variations_10min.png')\n",
    "plt.savefig(save_path)\n",
    "plt.show()  # Close the plot to release memory"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c1f9f99c",
   "metadata": {},
   "source": [
    "### Wind at 2.99 m, Wind speed from sonic"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f65162bb",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Plotting WS_ms_D15014_Avg and Wind_Speed vs Time\n",
    "plt.figure(figsize=(12, 6))\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['WS_ms_D15008_Avg'], marker='o', linestyle='-', color='b', label='WS_ms_D15014_Avg')\n",
    "plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['Wind_Speed'], marker='x', linestyle='-', color='r', label='Wind_Speed_sonic')\n",
    "plt.xlabel('Time')\n",
    "plt.ylabel('Wind Speed (m/s)')\n",
    "plt.title('Wind Speed vs Time')\n",
    "plt.grid(True)\n",
    "plt.legend()\n",
    "plt.xticks(rotation=45)\n",
    "plt.tight_layout()\n",
    "\n",
    "# Save the figure in the data_dir directory\n",
    "save_path = os.path.join(data_dir, 'Wind_Speed_Comparison_10min.png')\n",
    "plt.savefig(save_path)\n",
    "plt.show()  # Close the plot to release memory\n",
    "\n",
    "print(f\"Plot saved successfully as '{save_path}'\")\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "037cc293",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Calculate the temperature difference\n",
    "merged_data_10min['Windspeed_Diff'] = merged_data_10min['WS_ms_D15014_Avg'] - merged_data_10min['Wind_Speed']\n",
    "\n",
    "# Create a flag column\n",
    "merged_data_10min['Flag_windspeed'] = merged_data_10min['Windspeed_Diff'].apply(lambda x: 1 if abs(x) > 1 else 0)\n",
    "#Create the second flag column for rain condition\n",
    "merged_data_10min['Flag_Rain_windspeed'] = merged_data_10min['Rain'].apply(lambda x: 1 if x > 0 else 0)\n",
    "\n",
    "# Display the first few rows of the DataFrame to verify\n",
    "# Display the first few rows of the DataFrame to verify\n",
    "print(merged_data_10min[['TIMESTAMP', 'WS_ms_D15014_Avg', 'Wind_Speed', 'Windspeed_Diff', 'Rain', 'Flag_windspeed', 'Flag_Rain_windspeed']].head())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "69fd8540",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Create the figure\n",
    "plt.figure(figsize=(10, 6))\n",
    "\n",
    "# Plot all wind speed points\n",
    "plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['WS_ms_D15014_Avg'],\n",
    "            label='WS_4.47m (mast)', alpha=0.4, color='royalblue')\n",
    "plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['Wind_Speed'],\n",
    "            label='Wind_Speed (sonic)', alpha=0.4, color='seagreen')\n",
    "\n",
    "# Highlight flagged outliers\n",
    "flagged_data = merged_data_10min[merged_data_10min['Flag_windspeed'] == 1]\n",
    "plt.scatter(flagged_data['TIMESTAMP'], flagged_data['WS_ms_D15014_Avg'],\n",
    "            color='red', label='Flagged Outliers ($\\Delta w_s > 1$ m/s)', s=50)\n",
    "plt.scatter(flagged_data['TIMESTAMP'], flagged_data['Wind_Speed'],\n",
    "            color='red', s=50)# label='Flagged Outliers (Wind_Speed)',\n",
    "\n",
    "# Highlight rain-flagged points\n",
    "flagged_rain_data = merged_data_10min[merged_data_10min['Flag_Rain_windspeed'] == 1]\n",
    "plt.scatter(flagged_rain_data['TIMESTAMP'], flagged_rain_data['WS_ms_D15014_Avg'],\n",
    "            color='black', marker='+', label='Flagged Rain', s=100)# (WS_ms_D15014_Avg)\n",
    "plt.scatter(flagged_rain_data['TIMESTAMP'], flagged_rain_data['Wind_Speed'],\n",
    "            color='black', marker='+', s=100)# label='Flagged Rain (Wind_Speed)',\n",
    "\n",
    "# Format x-axis\n",
    "plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "plt.xticks(fontsize=18, rotation=45)\n",
    "plt.yticks(fontsize=18)\n",
    "\n",
    "# Axis labels and title\n",
    "plt.xlabel('Time', fontsize=20)\n",
    "plt.ylabel('Wind Speed (m/s)', fontsize=20)\n",
    "plt.title('Wind Speed Data with Flagged Outliers', fontsize=20, weight='bold')\n",
    "\n",
    "# Legend, grid, and layout\n",
    "plt.legend(fontsize=14, loc='upper left')\n",
    "plt.grid(True, linestyle='--', alpha=0.7)\n",
    "plt.tight_layout()\n",
    "\n",
    "# Save the plot\n",
    "output_path = os.path.join(data_dir, 'wind_speed_scatter_flagged.png')\n",
    "plt.savefig(output_path, dpi=300)\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "31f474c5",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "print(merged_data_10min.columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f5581bd2",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "# 1. Filter the DataFrame for rows where all the flags are 0\n",
    "filtered_data = merged_data_10min[\n",
    "    (merged_data_10min['Flag'] == 0) &\n",
    "     (merged_data_10min['Flag_Rain'] == 0) #& \n",
    " #  (merged_data_10min['Flag_windspeed'] == 0) &\n",
    "  #  (merged_data_10min['Flag_Rain_windspeed'] == 0)\n",
    "]\n",
    "\n",
    "# 2. Extract the necessary columns for plotting\n",
    "time = pd.to_datetime(filtered_data['TIMESTAMP'])  # Convert timestamp to datetime\n",
    "SHF = filtered_data['SHF']\n",
    "LHF = filtered_data['LHF']\n",
    "\n",
    "Net_Rad = filtered_data['Net_Radiation_10min']\n",
    "\n",
    "G = filtered_data['G']\n",
    "F_CO2 = filtered_data['F_CO2']\n",
    "\n",
    "# 3. Create the scatter plots\n",
    "fig, axs = plt.subplots(2, 2, figsize=(15, 10))\n",
    "\n",
    "# Scatter plot for SHF\n",
    "axs[0, 0].scatter(time, SHF, color='red', alpha=0.5,label='SHF')\n",
    "axs[0, 0].scatter(time, LHF, color='blue', alpha=0.5,label='LHF')\n",
    "\n",
    "axs[0, 0].set_title('Sensible and Latent Heat Flux')\n",
    "axs[0, 0].set_xlabel('Time')\n",
    "axs[0, 0].set_ylabel('Heat FLux (W/m²)')\n",
    "axs[0, 0].grid(True)\n",
    "axs[0, 0].legend()  # Add legend for SHF and LHF\n",
    "\n",
    "# Scatter plot for LHF\n",
    "axs[0, 1].scatter(time, Net_Rad, color='blue', alpha=0.5)\n",
    "axs[0, 1].set_title('Net Radiation')\n",
    "axs[0, 1].set_xlabel('Time')\n",
    "axs[0, 1].set_ylabel('Net_Rad (W/m²)')\n",
    "axs[0, 1].grid(True)\n",
    "\n",
    "# Scatter plot for G\n",
    "axs[1, 0].scatter(time, G, color='green', alpha=0.5)\n",
    "axs[1, 0].set_title('Ground Heat Flux (G)')\n",
    "axs[1, 0].set_xlabel('Time')\n",
    "axs[1, 0].set_ylabel('G (W/m²)')\n",
    "axs[1, 0].grid(True)\n",
    "\n",
    "# Scatter plot for F_CO2\n",
    "axs[1, 1].scatter(time, F_CO2, color='purple', alpha=0.5)\n",
    "axs[1, 1].set_title('CO2 Flux (F_CO2)')\n",
    "axs[1, 1].set_xlabel('Time')\n",
    "axs[1, 1].set_ylabel('F_CO2 (μmol m⁻² s⁻¹)')\n",
    "axs[1, 1].grid(True)\n",
    "\n",
    "# 4. Adjust layout for better spacing\n",
    "plt.tight_layout()\n",
    "plt.suptitle('Scatter Plots of SEB components Over Time (Filtered by Flags)', fontsize=16, y=1.02)\n",
    "\n",
    "# Create the full save path\n",
    "save_path = os.path.join(data_dir, 'SEB_scatter_plots_10min.png')  # The 'dpi' argument should not be in this line\n",
    "\n",
    "# Save the figure using plt.savefig\n",
    "plt.savefig(save_path, dpi=300,bbox_inches='tight')  # Add the 'dpi=300' here, inside plt.savefig()\n",
    "\n",
    "# Show the scatter plots\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3a80c021",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "# Define the full path to the CSV file\n",
    "fluxes_30min_file_path = os.path.join(data_dir, 'flux_data_30min.csv')\n",
    "\n",
    "# Load the CSV file into a DataFrame\n",
    "flux_data_30min = pd.read_csv(fluxes_30min_file_path)\n",
    "\n",
    "# Convert the 'TIMESTAMP' column to datetime format if it's not already\n",
    "flux_data_30min['TIMESTAMP'] = pd.to_datetime(flux_data_30min['TIMESTAMP'])\n",
    "\n",
    "\n",
    "# Print the first few rows to confirm the data is loaded correctly\n",
    "print(flux_data_30min.head())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d93e9e2f",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "# 2) Now define each path by joining that base folder with the filename:\n",
    "loobos_path                    = os.path.join(data_vl, 'Loobos_Fluxes.csv')\n",
    "amsterdam_path                 = os.path.join(data_vl, 'Amsterdam_Fluxes.csv')\n",
    "veenkampen_path                = os.path.join(data_vl, 'Veenkampen_Fluxes.csv')\n",
    "\n",
    "loobos_soil_path               = os.path.join(data_vl, 'Loobos_Soil.csv')\n",
    "veenkampen_soil_path           = os.path.join(data_vl, 'Veenkampen_Soil.csv')\n",
    "\n",
    "veenkampen_net_radiation_path  = os.path.join(data_vl, 'Veenkampen_Meteorology.csv')#Veenkampen_net_rad.csv #Veenkampen_Meteorology\n",
    "loobos_net_radiation_path      = os.path.join(data_vl, 'Loobos_Meteorology.csv')\n",
    "amsterdam_net_radiation_path   = os.path.join(data_vl, 'Amsterdam_Meteorology.csv')\n",
    "\n",
    "# Read the CSV files\n",
    "loobos_data = pd.read_csv(loobos_path)\n",
    "amsterdam_data = pd.read_csv(amsterdam_path)\n",
    "veenkampen_data = pd.read_csv(veenkampen_path)\n",
    "\n",
    "# Read the CSV files for soil data\n",
    "loobos_soil_data = pd.read_csv(loobos_soil_path)\n",
    "veenkampen_soil_data = pd.read_csv(veenkampen_soil_path)\n",
    "\n",
    "# Read the CSV files for net rad data\n",
    "veenkampen_net_rad = pd.read_csv(veenkampen_net_radiation_path)\n",
    "loobos_net_rad = pd.read_csv(loobos_net_radiation_path)\n",
    "amsterdam_net_rad = pd.read_csv(amsterdam_net_radiation_path)\n",
    "\n",
    "# Deleting the second row from each DataFrame\n",
    "loobos_data = loobos_data.drop(index=0).reset_index(drop=True)\n",
    "amsterdam_data = amsterdam_data.drop(index=0).reset_index(drop=True)\n",
    "veenkampen_data = veenkampen_data.drop(index=0).reset_index(drop=True)\n",
    "loobos_soil_data=loobos_soil_data.drop(index=0).reset_index(drop=True)\n",
    "veenkampen_soil_data=veenkampen_soil_data.drop(index=0).reset_index(drop=True)\n",
    "veenkampen_net_rad=veenkampen_net_rad.drop(index=0).reset_index(drop=True)\n",
    "loobos_net_rad=loobos_net_rad.drop(index=0).reset_index(drop=True)\n",
    "amsterdam_net_rad=amsterdam_net_rad.drop(index=0).reset_index(drop=True)\n",
    "\n",
    "# Keep only the specified columns\n",
    "columns_to_keep = ['Timestamp','H', 'LE','co2_flux']\n",
    "columns_to_keep_rad = ['Timestamp','SW_IN_1_1_1','SW_OUT_1_1_1','LW_IN_1_1_1','LW_OUT_1_1_1']\n",
    "\n",
    "columns_to_keep_soil = ['Timestamp','G_1_1_1','G_2_1_1','G_3_1_1','G_4_1_1']\n",
    "columns_to_keep_rad_loobos = ['Timestamp','SW_IN_1_1_1','SW_OUT_1_1_1','LW_IN_1_1_1','LW_OUT_1_1_1']\n",
    "columns_to_keep_rad_ams=['Timestamp','Rnet']\n",
    "\n",
    "loobos_data = loobos_data[columns_to_keep]\n",
    "amsterdam_data = amsterdam_data[columns_to_keep]\n",
    "veenkampen_data = veenkampen_data[columns_to_keep]\n",
    "loobos_soil_data=loobos_soil_data[columns_to_keep_soil]\n",
    "veenkampen_soil_data=veenkampen_soil_data[columns_to_keep_soil]\n",
    "veenkampen_net_rad=veenkampen_net_rad[columns_to_keep_rad]\n",
    "loobos_net_rad=loobos_net_rad[columns_to_keep_rad_loobos]\n",
    "amsterdam_net_rad=amsterdam_net_rad[columns_to_keep_rad_ams]\n",
    "\n",
    "# Convert 'H' and 'LE' columns to numeric\n",
    "loobos_data['H'] = pd.to_numeric(loobos_data['H'], errors='coerce')\n",
    "loobos_data['LE'] = pd.to_numeric(loobos_data['LE'], errors='coerce')\n",
    "loobos_data['co2_flux'] = pd.to_numeric(loobos_data['co2_flux'], errors='coerce')\n",
    "# Convert µmol m-2 s-1 to ppm m s^-1\n",
    "# Add the converted values back to the DataFrame\n",
    "loobos_data['F_CO2_ppm_ms'] = ((loobos_data['co2_flux'] / 1e6 )/ 22.414) * 1e6  \n",
    "\n",
    "amsterdam_data['H'] = pd.to_numeric(amsterdam_data['H'], errors='coerce')\n",
    "amsterdam_data['LE'] = pd.to_numeric(amsterdam_data['LE'], errors='coerce')\n",
    "amsterdam_data['co2_flux'] = pd.to_numeric(amsterdam_data['co2_flux'], errors='coerce')\n",
    "amsterdam_data['F_CO2_ppm_ms'] = ((amsterdam_data['co2_flux'] / 1e6 )/ 22.414) * 1e6  \n",
    "\n",
    "veenkampen_data['H'] = pd.to_numeric(veenkampen_data['H'], errors='coerce')\n",
    "veenkampen_data['LE'] = pd.to_numeric(veenkampen_data['LE'], errors='coerce')\n",
    "veenkampen_data['co2_flux'] = pd.to_numeric(veenkampen_data['co2_flux'], errors='coerce')\n",
    "veenkampen_data['F_CO2_ppm_ms'] = ((veenkampen_data['co2_flux'] / 1e6 )/ 22.414) * 1e6  \n",
    "\n",
    "for col in columns_to_keep_soil[1:]:  # Convert to numeric excluding 'Timestamp'\n",
    "    loobos_soil_data[col] = pd.to_numeric(loobos_soil_data[col], errors='coerce')\n",
    "    \n",
    "for col in columns_to_keep_soil[1:]:  # Convert to numeric excluding 'Timestamp'\n",
    "    veenkampen_soil_data[col] = pd.to_numeric(veenkampen_soil_data[col], errors='coerce')\n",
    "\n",
    "#veenkampen_net_rad['RN_1_1_1']=pd.to_numeric(veenkampen_net_rad['RN_1_1_1'],errors='coerce')\n",
    "for col in columns_to_keep_rad[1:]:  # Convert to numeric excluding 'Timestamp'\n",
    "    veenkampen_net_rad[col] = pd.to_numeric(veenkampen_net_rad[col], errors='coerce')\n",
    "    \n",
    "amsterdam_net_rad['Rnet']=pd.to_numeric(amsterdam_net_rad['Rnet'],errors='coerce')\n",
    "\n",
    "for col in columns_to_keep_rad_loobos[1:]:  # Convert to numeric excluding 'Timestamp'\n",
    "    loobos_net_rad[col] = pd.to_numeric(loobos_net_rad[col], errors='coerce')\n",
    "\n",
    "# Convert the 'Timestamp' column to datetime format (if it's not already)\n",
    "loobos_data['Timestamp'] = pd.to_datetime(loobos_data['Timestamp'])\n",
    "amsterdam_data['Timestamp'] = pd.to_datetime(amsterdam_data['Timestamp'])\n",
    "veenkampen_data['Timestamp'] = pd.to_datetime(veenkampen_data['Timestamp'])\n",
    "loobos_soil_data['Timestamp'] = pd.to_datetime(loobos_soil_data['Timestamp'])\n",
    "veenkampen_soil_data['Timestamp'] = pd.to_datetime(veenkampen_soil_data['Timestamp'])\n",
    "veenkampen_net_rad['Timestamp'] = pd.to_datetime(veenkampen_net_rad['Timestamp'])\n",
    "loobos_net_rad['Timestamp'] = pd.to_datetime(loobos_net_rad['Timestamp'])\n",
    "amsterdam_net_rad['Timestamp'] = pd.to_datetime(amsterdam_net_rad['Timestamp'])\n",
    "\n",
    "# Filter for rows with the specified date\n",
    "date_filter = date_str\n",
    "\n",
    "loobos_data = loobos_data[loobos_data['Timestamp'].dt.date == pd.to_datetime(date_filter).date()]\n",
    "amsterdam_data = amsterdam_data[amsterdam_data['Timestamp'].dt.date == pd.to_datetime(date_filter).date()]\n",
    "veenkampen_data = veenkampen_data[veenkampen_data['Timestamp'].dt.date == pd.to_datetime(date_filter).date()]\n",
    "loobos_soil_data = loobos_soil_data[loobos_soil_data['Timestamp'].dt.date == pd.to_datetime(date_filter).date()]\n",
    "veenkampen_soil_data = veenkampen_soil_data[veenkampen_soil_data['Timestamp'].dt.date == pd.to_datetime(date_filter).date()]\n",
    "veenkampen_net_rad = veenkampen_net_rad[veenkampen_net_rad['Timestamp'].dt.date == pd.to_datetime(date_filter).date()]\n",
    "loobos_net_rad = loobos_net_rad[loobos_net_rad['Timestamp'].dt.date == pd.to_datetime(date_filter).date()]\n",
    "amsterdam_net_rad=amsterdam_net_rad[amsterdam_net_rad['Timestamp'].dt.date == pd.to_datetime(date_filter).date()]\n",
    "\n",
    "# Drop NaN values\n",
    "loobos_data = loobos_data.dropna()\n",
    "amsterdam_data = amsterdam_data.dropna()\n",
    "veenkampen_data = veenkampen_data.dropna()\n",
    "loobos_soil_data=loobos_soil_data.dropna()\n",
    "veenkampen_soil_data=veenkampen_soil_data.dropna()\n",
    "veenkampen_net_rad=veenkampen_net_rad.dropna()\n",
    "loobos_net_rad=loobos_net_rad.dropna()\n",
    "amsterdam_net_rad=amsterdam_net_rad.dropna()\n",
    "\n",
    "# Reset index\n",
    "loobos_data = loobos_data.reset_index(drop=True)\n",
    "amsterdam_data = amsterdam_data.reset_index(drop=True)\n",
    "veenkampen_data = veenkampen_data.reset_index(drop=True)\n",
    "loobos_soil_data=loobos_soil_data.reset_index(drop=True)\n",
    "veenkampen_soil_data=veenkampen_soil_data.reset_index(drop=True)\n",
    "veenkampen_net_rad=veenkampen_net_rad.reset_index(drop=True)\n",
    "loobos_net_rad=loobos_net_rad.reset_index(drop=True)\n",
    "amsterdam_net_rad=amsterdam_net_rad.reset_index(drop=True)\n",
    "veenkampen_net_rad['Net_Rad']=veenkampen_net_rad['SW_OUT_1_1_1']-veenkampen_net_rad['SW_IN_1_1_1']+veenkampen_net_rad['LW_OUT_1_1_1']-veenkampen_net_rad['LW_IN_1_1_1']\n",
    "loobos_net_rad['Net_Rad']=loobos_net_rad['SW_OUT_1_1_1']-loobos_net_rad['SW_IN_1_1_1']+loobos_net_rad['LW_OUT_1_1_1']-loobos_net_rad['LW_IN_1_1_1']\n",
    "# Display the first few rows of each DataFrame\n",
    "print(\"Loobos Fluxes Data:\")\n",
    "print(loobos_data)\n",
    "\n",
    "print(\"\\nAmsterdam Fluxes Data:\")\n",
    "print(amsterdam_data)\n",
    "\n",
    "print(\"\\nVeenkampen Fluxes Data:\")\n",
    "print(veenkampen_data)\n",
    "\n",
    "#Round the 'Timestamp' to the nearest 10-minute interval for Loobos\n",
    "loobos_soil_data['Timestamp'] = loobos_soil_data['Timestamp'].dt.floor('10T')\n",
    "loobos_soil_data = loobos_soil_data.groupby('Timestamp').mean().reset_index()\n",
    "\n",
    "# Round the 'Timestamp' to the nearest 10-minute interval for Veenkampen\n",
    "veenkampen_soil_data['Timestamp'] = veenkampen_soil_data['Timestamp'].dt.floor('10T')\n",
    "veenkampen_soil_data = veenkampen_soil_data.groupby('Timestamp').mean().reset_index()\n",
    "\n",
    "# Round the 'Timestamp' to the nearest 10-minute interval for Veenkampen\n",
    "veenkampen_net_rad['Timestamp'] = veenkampen_net_rad['Timestamp'].dt.floor('10T')\n",
    "veenkampen_net_rad = veenkampen_net_rad.groupby('Timestamp').mean().reset_index()\n",
    "\n",
    "# Round the 'Timestamp' to the nearest 10-minute interval for Veenkampen\n",
    "loobos_net_rad['Timestamp'] = loobos_net_rad['Timestamp'].dt.floor('10T')\n",
    "loobos_net_rad = loobos_net_rad.groupby('Timestamp').mean().reset_index()\n",
    "\n",
    "\n",
    "# Round the 'Timestamp' to the nearest 10-minute interval for Veenkampen\n",
    "amsterdam_net_rad['Timestamp'] = amsterdam_net_rad['Timestamp'].dt.floor('10T')\n",
    "amsterdam_net_rad = amsterdam_net_rad.groupby('Timestamp').mean().reset_index()\n",
    "\n",
    "print(\"\\nLoobos Soil Data:\")\n",
    "print(loobos_soil_data)\n",
    "\n",
    "print(\"\\nVeenkampen Soil Data:\")\n",
    "print(veenkampen_soil_data)\n",
    "\n",
    "print(\"\\nVeenkampen Net Rad Data:\")\n",
    "print(veenkampen_net_rad)\n",
    "\n",
    "print(\"\\nLoobos Net Rad Data:\")\n",
    "print(loobos_net_rad)\n",
    "\n",
    "print(\"\\nAmsterdam Net Rad Data:\")\n",
    "print(amsterdam_net_rad)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b0504688",
   "metadata": {},
   "outputs": [],
   "source": [
    "# 1) SHF vs Time for Multiple Stations + Measured SHF\n",
    "# ------------------------------------------------------------\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "# Scatter for Loobos, Amsterdam, Veenkampen\n",
    "ax.scatter(\n",
    "    loobos_data['Timestamp'],\n",
    "    loobos_data['H'],\n",
    "    label='Loobos H',\n",
    "    color='green',\n",
    "    s=40,\n",
    "    alpha=0.7,\n",
    "    marker='o'\n",
    ")\n",
    "ax.scatter(\n",
    "    amsterdam_data['Timestamp'],\n",
    "    amsterdam_data['H'],\n",
    "    label='Amsterdam H',\n",
    "    color='orange',\n",
    "    s=40,\n",
    "    alpha=0.7,\n",
    "    marker='o'\n",
    ")\n",
    "ax.scatter(\n",
    "    veenkampen_data['Timestamp'],\n",
    "    veenkampen_data['H'],\n",
    "    label='Veenkampen H',\n",
    "    color='blue',\n",
    "    s=40,\n",
    "    alpha=0.7,\n",
    "    marker='o'\n",
    ")\n",
    "\n",
    "# Scatter for Measured SHF\n",
    "ax.scatter(\n",
    "    time,\n",
    "    SHF,\n",
    "    label='SHF (Measured)',\n",
    "    color='red',\n",
    "    s=40,\n",
    "    alpha=0.7,\n",
    "    marker='x'\n",
    ")\n",
    "\n",
    "# Title and labels in bold\n",
    "ax.set_title('SHF vs. Time', fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('SHF [W/m²]', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken all spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick parameters\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis formatting: HH:MM every 2 hours, minor ticks every 30 min\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Rotate x‐tick labels\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "# Grid styling\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# Legend styling\n",
    "legend = ax.legend(loc='upper left', fontsize=14, frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Tight layout and save\n",
    "plt.tight_layout()\n",
    "shf_stations_path = os.path.join(data_dir, 'SHF_vs_Stations.png')\n",
    "plt.savefig(shf_stations_path, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {shf_stations_path}\")\n",
    "plt.show()\n",
    "\n",
    "\n",
    "# ------------------------------------------------------------\n",
    "# 2) LHF vs Time for Multiple Stations + Measured LHF\n",
    "# ------------------------------------------------------------\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "# Scatter for Loobos, Amsterdam, Veenkampen\n",
    "ax.scatter(\n",
    "    loobos_data['Timestamp'],\n",
    "    loobos_data['LE'],\n",
    "    label='Loobos LE',\n",
    "    color='green',\n",
    "    s=40,\n",
    "    alpha=0.7,\n",
    "    marker='o'\n",
    ")\n",
    "ax.scatter(\n",
    "    amsterdam_data['Timestamp'],\n",
    "    amsterdam_data['LE'],\n",
    "    label='Amsterdam LE',\n",
    "    color='orange',\n",
    "    s=40,\n",
    "    alpha=0.7,\n",
    "    marker='o'\n",
    ")\n",
    "ax.scatter(\n",
    "    veenkampen_data['Timestamp'],\n",
    "    veenkampen_data['LE'],\n",
    "    label='Veenkampen LE',\n",
    "    color='blue',\n",
    "    s=40,\n",
    "    alpha=0.7,\n",
    "    marker='o'\n",
    ")\n",
    "\n",
    "# Scatter for Measured LHF\n",
    "ax.scatter(\n",
    "    time,\n",
    "    LHF,\n",
    "    label='LHF (Measured)',\n",
    "    color='red',\n",
    "    s=40,\n",
    "    alpha=0.7,\n",
    "    marker='x'\n",
    ")\n",
    "\n",
    "# Title and labels in bold\n",
    "ax.set_title('LHF vs. Time', fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('LHF [W/m²]', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken all spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick parameters\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis formatting: HH:MM every 2 hours, minor ticks every 30 min\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Rotate x‐tick labels\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "# Grid styling\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# Legend styling\n",
    "legend = ax.legend(loc='upper left', fontsize=14, frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Tight layout and save\n",
    "plt.tight_layout()\n",
    "lhf_stations_path = os.path.join(data_dir, 'LHF_vs_Stations.png')\n",
    "plt.savefig(lhf_stations_path, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {lhf_stations_path}\")\n",
    "plt.show()\n",
    "\n",
    "\n",
    "# ------------------------------------------------------------\n",
    "# 3) CO₂ Flux vs Time for Multiple Stations + Measured F_CO2\n",
    "# ------------------------------------------------------------\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "# Scatter for Loobos, Amsterdam, Veenkampen\n",
    "ax.scatter(\n",
    "    loobos_data['Timestamp'],\n",
    "    loobos_data['F_CO2_ppm_ms'],\n",
    "    label='Loobos F_CO₂',\n",
    "    color='green',\n",
    "    s=40,\n",
    "    alpha=0.7,\n",
    "    marker='o'\n",
    ")\n",
    "ax.scatter(\n",
    "    amsterdam_data['Timestamp'],\n",
    "    amsterdam_data['F_CO2_ppm_ms'],\n",
    "    label='Amsterdam F_CO₂',\n",
    "    color='orange',\n",
    "    s=40,\n",
    "    alpha=0.7,\n",
    "    marker='o'\n",
    ")\n",
    "ax.scatter(\n",
    "    veenkampen_data['Timestamp'],\n",
    "    veenkampen_data['F_CO2_ppm_ms'],\n",
    "    label='Veenkampen F_CO₂',\n",
    "    color='blue',\n",
    "    s=40,\n",
    "    alpha=0.7,\n",
    "    marker='o'\n",
    ")\n",
    "\n",
    "# Scatter for Measured F_CO2\n",
    "ax.scatter(\n",
    "    time,\n",
    "    F_CO2,\n",
    "    label='F_CO₂ (Measured)',\n",
    "    color='red',\n",
    "    s=40,\n",
    "    alpha=0.7,\n",
    "    marker='x'\n",
    ")\n",
    "\n",
    "# Title and labels in bold\n",
    "ax.set_title('CO₂ Flux vs. Time', fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('CO₂ Flux [ppm · m s⁻¹]', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken all spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick parameters\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis formatting: HH:MM every 2 hours, minor ticks every 30 min\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Rotate x‐tick labels\n",
    "plt.xticks(rotation=45)\n",
    "plt.ylim(-1,1)\n",
    "# Grid styling\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# Legend styling\n",
    "legend = ax.legend(loc='upper left', fontsize=14, frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Tight layout and save\n",
    "plt.tight_layout()\n",
    "co2_flux_path = os.path.join(data_dir, 'FCO2_vs_Stations.png')\n",
    "plt.savefig(co2_flux_path, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {co2_flux_path}\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dc52303c",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "plt.figure(figsize=(14, 6))\n",
    "\n",
    "\n",
    "# Plotting Loobos Soil Fluxes with circular markers and green color\n",
    "plt.scatter(loobos_soil_data['Timestamp'], loobos_soil_data['G_1_1_1'], \n",
    "            label='Loobos G_1_1_1', color='green', alpha=0.5, s=30, marker='o')  # Circle\n",
    "plt.scatter(loobos_soil_data['Timestamp'], loobos_soil_data['G_2_1_1'], \n",
    "            label='Loobos G_2_1_1', color='green', alpha=0.5, s=30, marker='D')  # Circle\n",
    "plt.scatter(loobos_soil_data['Timestamp'], loobos_soil_data['G_3_1_1'], \n",
    "            label='Loobos G_3_1_1', color='green', alpha=0.5, s=30, marker='v')  # Circle\n",
    "plt.scatter(loobos_soil_data['Timestamp'], loobos_soil_data['G_4_1_1'], \n",
    "            label='Loobos G_4_1_1', color='green', alpha=0.5, s=30, marker='^')  # Circle\n",
    "\n",
    "# Plotting Veenkampen Soil Fluxes with circular markers and blue color\n",
    "plt.scatter(veenkampen_soil_data['Timestamp'], veenkampen_soil_data['G_1_1_1'], \n",
    "            label='Veenkampen G_1_1_1', color='blue', alpha=0.5, s=30, marker='o')  # Circle\n",
    "plt.scatter(veenkampen_soil_data['Timestamp'], veenkampen_soil_data['G_2_1_1'], \n",
    "            label='Veenkampen G_2_1_1', color='blue', alpha=0.5, s=30, marker='D')  # Circle\n",
    "plt.scatter(veenkampen_soil_data['Timestamp'], veenkampen_soil_data['G_3_1_1'], \n",
    "            label='Veenkampen G_3_1_1', color='blue', alpha=0.5, s=30, marker='v')  # Circle\n",
    "plt.scatter(veenkampen_soil_data['Timestamp'], veenkampen_soil_data['G_4_1_1'], \n",
    "            label='Veenkampen G_4_1_1', color='blue', alpha=0.5, s=30, marker='^')  # Circle\n",
    "\n",
    "# Example of measured G values (replace 'time' and 'G' with actual variables)\n",
    "plt.scatter(time, G, label='G (Measured)', color='red', s=30, alpha=0.5, marker='x')  # Cross\n",
    "\n",
    "# Customize the plot\n",
    "plt.title('Soil Fluxes vs. Time', fontsize=20)\n",
    "plt.xlabel('Timestamp', fontsize=16)\n",
    "plt.ylabel('Soil Flux [W/m^2]', fontsize=16)\n",
    "plt.xticks(rotation=45, fontsize=12)\n",
    "plt.yticks(fontsize=12)\n",
    "plt.grid(True, linestyle='--', alpha=0.5)  # Change grid style\n",
    "plt.legend(fontsize=10)\n",
    "plt.tight_layout()\n",
    "\n",
    "# Show plot\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2cfa850b",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Net Radiation vs. Time\n",
    "# ------------------------------------------------------------\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "# Scatter for Veenkampen (negative Net_Rad)\n",
    "ax.scatter(\n",
    "    veenkampen_net_rad['Timestamp'],\n",
    "    -veenkampen_net_rad['Net_Rad'],\n",
    "    label='Veenkampen Net Rad',\n",
    "    color='blue',\n",
    "    alpha=0.5,\n",
    "    s=40,\n",
    "    marker='o'\n",
    ")\n",
    "\n",
    "# Scatter for Loobos (negative Net_Rad)\n",
    "ax.scatter(\n",
    "    loobos_net_rad['Timestamp'],\n",
    "    -loobos_net_rad['Net_Rad'],\n",
    "    label='Loobos Net Rad',\n",
    "    color='green',\n",
    "    alpha=0.5,\n",
    "    s=40,\n",
    "    marker='o'\n",
    ")\n",
    "\n",
    "# Scatter for Amsterdam (Rnet, assume already signed correctly)\n",
    "ax.scatter(\n",
    "    amsterdam_net_rad['Timestamp'],\n",
    "    amsterdam_net_rad['Rnet'],\n",
    "    label='Amsterdam Net Rad',\n",
    "    color='orange',\n",
    "    alpha=0.5,\n",
    "    s=40,\n",
    "    marker='o'\n",
    ")\n",
    "\n",
    "# Scatter for measured Net_Rad (negative)\n",
    "ax.scatter(\n",
    "    time,\n",
    "    -Net_Rad,\n",
    "    label='Net_Rad (Measured)',\n",
    "    color='red',\n",
    "    s=40,\n",
    "    alpha=0.5,\n",
    "    marker='x'\n",
    ")\n",
    "\n",
    "# Title and labels in bold\n",
    "ax.set_title('Net Radiation vs. Time', fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('Net Radiation [W/m²]', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken all spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick parameters\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis formatting: HH:MM every 2 hours, minor ticks every 30 min\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Rotate x‐tick labels\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "# Y‐axis tick font size\n",
    "ax.tick_params(axis='y', labelsize=18)\n",
    "\n",
    "# Grid styling: dashed major, dotted minor\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# Legend styling\n",
    "legend = ax.legend(loc='upper left', fontsize=14, frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Tight layout and save\n",
    "plt.tight_layout()\n",
    "output_path = os.path.join(data_dir, 'net_radiation_vs_time.png')\n",
    "plt.savefig(output_path, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {output_path}\")\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6b669eaa",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Round to the nearest 30-minute interval and average for Radiation\n",
    "veenkampen_net_rad['Timestamp'] = veenkampen_net_rad['Timestamp'].dt.floor('30T')\n",
    "veenkampen_net_rad = veenkampen_net_rad.groupby('Timestamp').mean().reset_index()\n",
    "\n",
    "# Merge with Veenkampen data based on 'Timestamp'\n",
    "merged_data = pd.merge(veenkampen_data, veenkampen_net_rad, on='Timestamp', how='inner')\n",
    "\n",
    "merged_data['G'] = -(merged_data['H'] + merged_data['LE'] + merged_data['Net_Rad'])#RN_1_1_1\n",
    "print(merged_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "de9d8f03",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Round to the nearest 30-minute interval and average for Radiation\n",
    "loobos_net_rad['Timestamp'] = loobos_net_rad['Timestamp'].dt.floor('30T')\n",
    "loobos_net_rad = loobos_net_rad.groupby('Timestamp').mean().reset_index()\n",
    "\n",
    "# Merge with Veenkampen data based on 'Timestamp'\n",
    "merged_data_loobos = pd.merge(loobos_data, loobos_net_rad, on='Timestamp', how='inner')\n",
    "\n",
    "merged_data_loobos['G'] = -(merged_data_loobos['H'] + merged_data_loobos['LE'] + merged_data_loobos['Net_Rad'])\n",
    "print(merged_data_loobos)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7cac4897",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Round to the nearest 30-minute interval and average for Radiation\n",
    "amsterdam_net_rad['Timestamp'] = amsterdam_net_rad['Timestamp'].dt.floor('30T')\n",
    "amsterdam_net_rad = amsterdam_net_rad.groupby('Timestamp').mean().reset_index()\n",
    "\n",
    "# Merge with Veenkampen data based on 'Timestamp'\n",
    "merged_data_amsterdam = pd.merge(amsterdam_data, amsterdam_net_rad, on='Timestamp', how='inner')\n",
    "\n",
    "merged_data_amsterdam['G'] = -(merged_data_amsterdam['H'] + merged_data_amsterdam['LE'] - merged_data_amsterdam['Rnet'])\n",
    "print(merged_data_amsterdam)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dc0e47cf",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ------------------------------------------------------------\n",
    "fig, ax = plt.subplots(figsize=(10, 6))\n",
    "\n",
    "# Scatter for Veenkampen G (as residual)\n",
    "ax.scatter(\n",
    "    merged_data['Timestamp'],\n",
    "    merged_data['G'],\n",
    "    label='Veenkampen G (residual)',\n",
    "    color='blue',\n",
    "    s=40,\n",
    "    alpha=0.7,\n",
    "    marker='o'\n",
    ")\n",
    "\n",
    "# Scatter for Loobos G (as residual)\n",
    "ax.scatter(\n",
    "    merged_data_loobos['Timestamp'],\n",
    "    merged_data_loobos['G'],\n",
    "    label='Loobos G (residual)',\n",
    "    color='green',\n",
    "    s=40,\n",
    "    alpha=0.7,\n",
    "    marker='o'\n",
    ")\n",
    "\n",
    "# Scatter for Amsterdam G (as residual)\n",
    "ax.scatter(\n",
    "    merged_data_amsterdam['Timestamp'],\n",
    "    merged_data_amsterdam['G'],\n",
    "    label='Amsterdam G (residual)',\n",
    "    color='orange',\n",
    "    s=40,\n",
    "    alpha=0.7,\n",
    "    marker='o'\n",
    ")\n",
    "\n",
    "# Scatter for measured G values\n",
    "ax.scatter(\n",
    "    time,\n",
    "    G,\n",
    "    label='G (Measured)',\n",
    "    color='red',\n",
    "    s=40,\n",
    "    alpha=0.7,\n",
    "    marker='x'\n",
    ")\n",
    "\n",
    "# Title and labels in bold\n",
    "ax.set_title('Ground Heat Flux vs. Time', fontsize=20, fontweight='bold')\n",
    "ax.set_xlabel('Time', fontsize=20, fontweight='bold')\n",
    "ax.set_ylabel('G [W/m²]', fontsize=20, fontweight='bold')\n",
    "\n",
    "# Thicken all spines\n",
    "for spine in ax.spines.values():\n",
    "    spine.set_linewidth(1.5)\n",
    "\n",
    "# Tick parameters\n",
    "ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)\n",
    "ax.tick_params(axis='both', which='minor', width=1.0, length=4)\n",
    "\n",
    "# X‐axis formatting: HH:MM every 2 hours, minor ticks every 30 min\n",
    "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
    "ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))\n",
    "ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))\n",
    "\n",
    "# Rotate x‐tick labels\n",
    "plt.xticks(rotation=45)\n",
    "\n",
    "# Y‐axis tick font size\n",
    "ax.tick_params(axis='y', labelsize=18)\n",
    "\n",
    "# Grid styling: dashed major, dotted minor\n",
    "ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)\n",
    "ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)\n",
    "\n",
    "# Legend styling\n",
    "legend = ax.legend(loc='upper left', fontsize=14, frameon=True)\n",
    "legend.get_frame().set_linewidth(1.5)\n",
    "\n",
    "# Tight layout and save\n",
    "plt.tight_layout()\n",
    "output_path = os.path.join(data_dir, 'ground_heat_flux_vs_time.png')\n",
    "plt.savefig(output_path, dpi=300,bbox_inches='tight')\n",
    "print(f\"Plot saved to: {output_path}\")\n",
    "plt.show()\n",
    "\n",
    "\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
